Abstract. We show that graphs generated by collapsible pushdown systems of level 2 are
tree-automatic. Even if we allow $\varepsilon$-contractions and reachability predicates (with regular
constraints) for pairs of configurations, the structures remain tree-automatic, whence their
first-order logic theories are decidable. As a corollary we obtain the tree-automaticity of
the second level of the Caucal-hierarchy.

1. Introduction

Higher-order pushdown systems were first introduced by Maslov [14], [15] as accepting devices
for word languages. Later, Knapik et al. [13] studied them as generators for trees. They
obtained an equi-expressivity result for higher-order pushdown systems and for higher-order
recursion schemes that satisfy the constraint of safety, which is a rather unnatural syntactic
condition. Hague et al. [10] introduced collapsible pushdown systems as extensions of
higher-order pushdown systems and proved that these have exactly the same power as higher-order
recursion schemes as methods for generating trees.

Both higher-order and collapsible pushdown systems also form interesting devices for
generating graphs. Carayol and Wöhrle [6] showed that the graphs generated by
higher-order pushdown systems of level $l$ coincide with the graphs in the $l$-th level of the
Caucal-hierarchy, a class of graphs introduced by Caucal [7]. Every level of this hierarchy is
obtained from the preceding level by applying graph unfoldings and monadic second-order
interpretations. Both operations preserve the decidability of the monadic second-order
theory, whence the Caucal-hierarchy forms a large class of graphs with decidable monadic
second-order theories. If we use collapsible pushdown systems as generators for graphs we
obtain a different situation. Hague et al. showed that even the second level of the hierarchy
contains a graph with undecidable monadic second-order theory. Furthermore, they showed
the decidability of the modal $\mu$-calculus theories of all graphs in the hierarchy. These results
turn graphs generated by collapsible pushdown systems into an interesting class. The author
knows only one further natural class of graphs which shares these two properties, viz. the
class of nested pushdown trees (cf. p~]). Moreover this class can be seen as a subclass of 
that of collapsible pushdown graphs (cf. [11]).

This paper is the long version of [12] and studies the first-order model-checking
problem on collapsible pushdown graphs. We show that the graphs in the second level of the
collapsible pushdown hierarchy are tree-automatic. Tree-automatic structures were
introduced by Blumensath [2]. These structures enjoy decidable first-order theories due to the
good closure properties of finite automata. Since the translation from collapsible pushdown
systems into tree-automata presentations of the generated graphs is uniform, our result
implies that first-order model-checking on collapsible pushdown graphs of level 2 is decidable:
given a pushdown system, first compute the tree-automata representing its graph, then
apply classical model-checking for tree-automatic structures.

Moreover, the result still holds if regular reachability predicates are added to the graphs. 

1.1. Main Result. 

Theorem 1.1. Let $\mathcal{S}$ be a collapsible pushdown system of level 2 with configuration graph
$\mathfrak{G}$. Let $\mathfrak{G}/\varepsilon$ be the $\varepsilon$-contraction of $\mathfrak{G}$. Any expansion of $\mathfrak{G}/\varepsilon$ by regular reachability relations
is tree-automatic.¹

A regular reachability relation is of the form $\mathrm{Reach}_L$ for some regular language $L$. For
nodes $a, b$ of some graph $\mathfrak{G}$ with labelled edges, $\mathfrak{G} \models \mathrm{Reach}_L(a, b)$ if there is a path from
$a$ to $b$ which is labelled by some word $w \in L$. The translation from collapsible pushdown
systems to tree-automatic presentations is uniform, i.e., there is a uniform way of computing,
given a collapsible pushdown system (and finite automata representing regular languages
over the edge-alphabet of the system), the tree-automata presentation of the $\varepsilon$-contraction
of the generated graph (expanded by the regular reachability predicates). Once we have
obtained tree-automata representing some graph, first-order model-checking on this graph
is decidable. Combining these results we obtain the following corollary.

Corollary 1.2. The following problem is decidable: 

Input: a collapsible pushdown system $\mathcal{S}$ (of level 2), finite automata $\mathcal{A}_1, \dots, \mathcal{A}_n$ representing
regular languages $L_1, \dots, L_n$, and a formula $\varphi$ in first-order logic extended by the relations
$\mathrm{Reach}_{L_1}, \dots, \mathrm{Reach}_{L_n}$.

Output: $\mathfrak{G}/\varepsilon \models \varphi$? ($\mathfrak{G}/\varepsilon$ denotes the $\varepsilon$-contraction of the graph generated by $\mathcal{S}$.)

We also show that the decision procedure is necessarily nonelementary in the size of 
the formula. 

1.2. Outline of the Paper. Sections 2.1 and 2.2 introduce basic notation. In Section 2.3
we introduce collapsible pushdown graphs (of level 2) and we show basic properties of these
graphs. We recall the notion of tree-automaticity in Section 2.4. We present our encoding
of configurations of collapsible pushdown graphs as trees in Section 3.1. We also show that,
once we have proved that regular reachability is tree-automatic via this encoding, collapsible
pushdown graphs are tree-automatic. The final part of that section discusses the optimality
of the first-order model-checking algorithm obtained from this tree-automata approach.
Sections 4 and 5 complete the proof by showing that regular reachability is actually
tree-automatic via our encoding. In Section 4 we develop the technical machinery for proving
the regularity of the reachability relation. We analyse how arbitrary runs of collapsible
pushdown systems decompose as sequences of simpler runs. Afterwards, in Section 5 we
apply these results and show that the encoding turns the regular reachability relations into
tree-automatic relations. Finally, Section 6 contains concluding remarks.

2. Preliminaries and Basic Definitions 
For a function $f$, we denote by $\mathrm{dom}(f)$ its domain.



¹ Due to Broadbent et al. [4], the result still holds if we expand the graph by $L\mu$-definable predicates.

2.1. Logics. We denote by FO first-order logic. Given some graph $\mathfrak{G} = (V, E_1, E_2, \dots, E_n)$
with labelled edge relations $E_1, \dots, E_n \subseteq V \times V$ we denote by Reach the reachability
predicate, defined by

\[ \mathrm{Reach} := \Big\{ (a, b) : \text{there is a } \Big(\bigcup_{i=1}^{n} E_i\Big)\text{-path from } a \text{ to } b \Big\}. \]

Given some regular language $L$, we denote by

\[ \mathrm{Reach}_L := \Big\{ (a, b) : \text{there is a } \Big(\bigcup_{i=1}^{n} E_i\Big)\text{-path from } a \text{ to } b \text{ labelled by some word } w \in L \Big\} \]

the reachability predicate with respect to $L$.
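
On a finite graph, $\mathrm{Reach}_L$ can be computed by a search in the product of the graph with a deterministic automaton for $L$. The following Python sketch illustrates this under our own assumptions: the graph is finite and given as an adjacency dictionary, and $L$ is given by a DFA encoded as a triple; the names `reach_L`, `edges` and `dfa` are hypothetical and not taken from the paper. The graphs studied in this paper are infinite, so the sketch only illustrates the definition.

```python
from collections import deque

def reach_L(edges, dfa, start):
    """Return all nodes b such that (start, b) is in Reach_L.

    edges: dict mapping a node to a list of (label, successor) pairs
    dfa:   (initial_state, accepting_states, delta) with
           delta[(state, label)] = next state (missing entries mean "no move")
    """
    init, accepting, delta = dfa
    seen = {(start, init)}
    queue = deque(seen)
    result = set()
    while queue:
        node, state = queue.popleft()
        if state in accepting:
            # The word read so far lies in L, so `node` is L-reachable.
            result.add(node)
        for label, succ in edges.get(node, []):
            nxt = delta.get((state, label))
            if nxt is not None and (succ, nxt) not in seen:
                seen.add((succ, nxt))
                queue.append((succ, nxt))
    return result

# Toy graph 0 -a-> 1 -b-> 2 -a-> 0 and a DFA for L = (ab)*.
edges = {0: [("a", 1)], 1: [("b", 2)], 2: [("a", 0)]}
dfa_ab_star = (0, {0}, {(0, "a"): 1, (1, "b"): 0})
print(reach_L(edges, dfa_ab_star, 0))
```

In the toy example the nodes reachable from 0 along a word in $(ab)^*$ are 0 (via the empty word) and 2 (via $ab$).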



2.2. Words and Trees. For words $w_1, w_2 \in \Sigma^*$ we write $w_1 \le w_2$ if $w_1$ is a prefix of $w_2$.
$w_1 \sqcap w_2$ denotes the greatest common prefix of $w_1$ and $w_2$. The concatenation of $w_1$ and
$w_2$ is denoted by $w_1 w_2$.

We call a finite set $D \subseteq \{0, 1\}^*$ a tree domain, if $D$ is prefix closed. A $\Sigma$-labelled tree is
a mapping $T : D \to \Sigma$ for $D$ some tree domain. For $d \in D$ we denote the subtree rooted at
$d$ by $T_d$. This is the tree defined by $T_d(e) := T(de)$. We will usually write $d \in T$ instead of
$d \in \mathrm{dom}(T)$. We denote the depth of the tree $T$ by $\mathrm{dpt}(T) := \max\{|t| : t \in \mathrm{dom}(T)\}$.

For $T$ some tree with domain $D$, let $D^{+}$ denote the set of minimal elements of the
complement of $D$, i.e.,

\[ D^{+} = \{ e \in \{0, 1\}^* \setminus D : \text{all proper ancestors of } e \text{ are contained in } D \}. \]

We write $D^{\oplus}$ for $D \cup D^{+}$. Note that $D^{\oplus}$ is the extension of the tree domain $D$ by one layer.

Sometimes it is useful to define trees inductively by describing the subtrees rooted at
0 and 1. For this purpose we fix the following notation. Let $\hat{T}_0$ and $\hat{T}_1$ be $\Sigma$-labelled trees
and $\sigma \in \Sigma$. Then we write $T := \sigma(\hat{T}_0; \hat{T}_1)$ for the $\Sigma$-labelled tree $T$ with the following
three properties:

1. $T(\varepsilon) = \sigma$, 2. $T_0 = \hat{T}_0$, and 3. $T_1 = \hat{T}_1$.

We denote by $\mathcal{T}_\Sigma$ the set of all $\Sigma$-labelled trees.
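
The tree notions above translate directly into a small data structure. The sketch below is only an illustration under our own conventions: a tree is a Python dictionary from positions (strings over `0` and `1`) to labels, and the helper names `is_tree_domain`, `subtree`, `depth`, `frontier` and `node` are hypothetical. `frontier` computes $D^{+}$ and `node` builds $\sigma(\hat{T}_0; \hat{T}_1)$.

```python
def is_tree_domain(D):
    """A finite set of strings over {0,1} is a tree domain iff it is prefix closed."""
    return all(set(d) <= {"0", "1"} and (d == "" or d[:-1] in D) for d in D)

def subtree(T, d):
    """The subtree T_d with T_d(e) := T(de)."""
    return {e[len(d):]: lab for e, lab in T.items() if e.startswith(d)}

def depth(T):
    """dpt(T): the maximal length of a position of a nonempty tree T."""
    return max(len(t) for t in T)

def frontier(D):
    """D^+: the minimal elements of the complement of the tree domain D."""
    if not D:
        return {""}
    return {d + b for d in D for b in "01"} - set(D)

def node(label, left=None, right=None):
    """sigma(T0; T1): root labelled `label`, subtrees at 0 and 1 (None = empty tree)."""
    T = {"": label}
    for bit, sub in (("0", left), ("1", right)):
        for d, lab in (sub or {}).items():
            T[bit + d] = lab
    return T

# A tree with root a, left child b, right child c, and c has a right child d.
T = node("a", node("b"), node("c", right=node("d")))
print(sorted(T), depth(T), sorted(frontier(set(T))))
```

The extended domain $D^{\oplus}$ is then simply `set(T) | frontier(set(T))`.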



2.3. Collapsible Pushdown Graphs. Before we introduce collapsible pushdown graphs 
(CPG) in detail, we fix some notation. Then we informally explain collapsible pushdown 
systems. Afterwards we formally introduce these systems and the graphs generated by 
them. We conclude this section with some basic results on runs of collapsible pushdown 
systems. 

We set $\Sigma^{+2} := (\Sigma^{+})^{+}$ and $\Sigma^{*2} := (\Sigma^{*})^{*}$. Each element of $\Sigma^{*2}$ is called a 2-word. Stacks
of a collapsible pushdown system are certain 2-words from $\Sigma^{+2}$ over a special alphabet. In
analogy, we will call words also 1-words.

Let us fix a 2-word $s \in \Sigma^{*2}$. $s$ consists of an ordered list $w_1, w_2, \dots, w_m$ of words. If
we want to state this list of words explicitly, we write $s = w_1 : w_2 : \dots : w_m$. Let $|s| := m$
denote the width of $s$. The height of $s$ is $\mathrm{hgt}(s) := \max\{|w_i| : 1 \le i \le m\}$ which is the
length of the longest word occurring in $s$.

Let $s'$ be another 2-word with $s' = w'_1 : w'_2 : \dots : w'_l$. We write $s : s'$ for the
concatenation $w_1 : w_2 : \dots : w_m : w'_1 : w'_2 : \dots : w'_l$.

For some word $w$, let $[w]$ be the 2-word that only consists of $w$. We regularly omit the
brackets if no confusion arises.

A level 2 stack $s$ over some alphabet $\Sigma$ is a 2-word over $\Sigma$ where each letter additionally
carries a link to some $i$-word for $1 \le i \le 2$. We call $i$ the level of the link. The idea is
that the linked $i$-word contains some information about what the stack looked like when
the letter was created.

We first define the initial level 2 stack; afterwards we describe the stack operations that 
are used to generate all level 2 stacks from the initial one. 

Definition 2.1. Let $\Sigma$ be some finite alphabet with a distinguished bottom-of-stack symbol
$\bot \in \Sigma$. The initial stack of level 1 is the word $\bot_1 := \bot$. The initial stack of level 2 is the
2-word $\bot_2 := [\bot_1]$.

We informally describe the stack operations that can be applied to a level 2 stack. 

• The push operation of level 1, denoted by $\mathrm{push}_{\sigma,k}$ for $\sigma \in \Sigma$ and $1 \le k \le 2$, writes the
symbol $\sigma$ onto the topmost level 1 stack and attaches a link of level $k$. This link points
to a copy of the topmost $k$-word of the resulting stack without the topmost level $k - 1$ stack.
For $k = 2$ this means that the link points to the current stack where the topmost word
is removed. For $k = 1$ the link points to the topmost word of the stack before the push
operation was performed.

• The push operation of level 2 is denoted by $\mathrm{clone}_2$. It duplicates the topmost word. Since
this also copies the values of the links stored in the topmost word, the copy of each symbol
in the newly created word still contains the information about what the stack looked like when
the corresponding original symbol was pushed onto the stack.

• The level $i$ pop operation $\mathrm{pop}_i$ for $1 \le i \le 2$ removes the topmost entry of the topmost
$i$-word.

• The last operation is collapse. The result of collapse is determined by the link attached
to the topmost letter of the stack. If we apply collapse to a stack $s$ where the link level
of the topmost letter is $i$, then collapse replaces the topmost level $i$ stack of $s$ by the
level $i$ stack to which the link points. Due to how push operations create these links, the
application of a collapse is equivalent to the application of a sequence of $\mathrm{pop}_i$ operations
where the link of the topmost letter controls how long this sequence is.

In the following, we formally introduce collapsible pushdown stacks and the stack
operations. We represent such a stack of letters with links as a 2-word from
$(\Sigma \cup (\Sigma \times \{2\} \times \mathbb{N}))^{+2}$. We consider elements from $\Sigma$ as elements with a link of level 1 and
elements $(\sigma, 2, k)$ as letters with a link of level 2. In the latter case, the third component
specifies the width of the substack to which the link points. For letters with link of level 1,
the position of this letter within the stack already determines the stack to which the link
points. Thus, we need not explicitly specify the link in this case.

Remark 2.2. Other equivalent definitions, for instance in [10], use a different way of storing
the links: they also store symbols $(\sigma, i, n)$ on the stack, but here $n$ denotes the number of
$\mathrm{pop}_i$ transitions that are equivalent to performing the collapse operation at a stack with
topmost element $(\sigma, i, n)$. The disadvantage of that approach is that the $\mathrm{clone}_i$ operation
cannot copy stacks. Instead, it can only copy the symbols stored in the topmost stack and
has to alter the links in the new copy. A clone of level $i$ must replace all links $(\sigma, i, n)$ by
$(\sigma, i, n + 1)$ in order to preserve the links stored in the stack.

We introduce some auxiliary functions which are useful for defining the stack operations.

Definition 2.3. For $s = w_1 : w_2 : \dots : w_n \in (\Sigma \cup (\Sigma \times \{2\} \times \mathbb{N}))^{+2}$ and $w_n = a_1 a_2 \dots a_m$,
we define the following auxiliary functions:

• $\mathrm{top}_2(s) := w_n$ and $\mathrm{top}_1(s) := a_m$.

• The topmost symbol is $\mathrm{Sym}(s) := \sigma$ for $\mathrm{top}_1(s) = \sigma \in \Sigma$ or $\mathrm{top}_1(s) = (\sigma, 2, k) \in \Sigma \times \{2\} \times \mathbb{N}$.

• The collapse level of the topmost element is
\[ \mathrm{CLvl}(s) := \begin{cases} 1 & \text{if } \mathrm{top}_1(s) \in \Sigma, \\ 2 & \text{otherwise.} \end{cases} \]

• Set
\[ \mathrm{CLnk}(s) := \begin{cases} j & \text{if } \mathrm{top}_1(s) \in \Sigma \times \{2\} \times \{j\}, \\ m - 1 & \text{if } \mathrm{top}_1(s) \in \Sigma. \end{cases} \]
$\mathrm{CLnk}(s)$ is called the collapse link of the topmost element.

• Set $p_{\sigma,2,k}(w_n) := w_n (\sigma, 2, k)$ and $p_{\sigma,2,k}(s) := w_1 : w_2 : \dots : w_{n-1} : p_{\sigma,2,k}(w_n)$ for all
$k \in \mathbb{N}$.

These auxiliary functions are useful for the formalisation of the stack operations.

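Definition 2.4. For $s = w_1 : w_2 : \dots : w_n \in (\Sigma \cup (\Sigma \times \{2\} \times \mathbb{N}))^{+2}$ and $w_n = a_1 a_2 \dots a_m$,
for $\sigma \in \Sigma \setminus \{\bot\}$ and for $1 \le k \le 2$, we define the stack operations

\[ \mathrm{clone}_2(s) := w_1 : w_2 : \dots : w_{n-1} : w_n : w_n, \]

\[ \mathrm{push}_{\sigma,k}(s) := \begin{cases} w_1 : w_2 : \dots : w_{n-1} : w_n \sigma & \text{if } k = 1, \\ p_{\sigma,k,n-1}(s) & \text{if } k = 2, \end{cases} \]

\[ \mathrm{pop}_k(s) := \begin{cases} w_1 : w_2 : \dots : w_{n-1} & \text{if } k = 2,\ n > 1, \\ w_1 : w_2 : \dots : w_{n-1} : [a_1 a_2 \dots a_{m-1}] & \text{if } k = 1,\ m > 1, \\ \text{undefined} & \text{otherwise,} \end{cases} \]

\[ \mathrm{collapse}(s) := \begin{cases} w_1 : w_2 : \dots : w_k & \text{if } \mathrm{CLvl}(s) = 2,\ \mathrm{CLnk}(s) = k > 0, \\ \mathrm{pop}_1(s) & \text{if } \mathrm{CLvl}(s) = 1,\ m > 1, \\ \text{undefined} & \text{otherwise.} \end{cases} \]

The set of level 2 operations is

\[ \mathrm{OP} := \{ (\mathrm{push}_{\sigma,k})_{\sigma \in \Sigma,\ 1 \le k \le 2},\ \mathrm{clone}_2,\ (\mathrm{pop}_k)_{1 \le k \le 2},\ \mathrm{collapse} \}. \]

The set of (level 2) stacks $\mathrm{Stck}(\Sigma)$ is the smallest set that contains $\bot_2$ and is closed under
application of operations from OP.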

Note that we defined collapse and $\mathrm{pop}_k$ in such a way that the resulting stack is
always nonempty and does not contain empty words. This avoids the special treatment of
empty stacks. Note that there is no $\mathrm{clone}_1$ operation. Thus, any collapse that works on level
1 is equivalent to one $\mathrm{pop}_1$ operation because level 1 links always point to the preceding
letter. Every collapse that works on level 2 is equivalent to a sequence of $\mathrm{pop}_2$ operations.
Next, we introduce the substack relation.

Definition 2.5. Let $s, s' \in \mathrm{Stck}(\Sigma)$. We say that $s'$ is a substack of $s$ (denoted as $s' \le s$)
if there are $n, m \in \mathbb{N}$ such that $s' = \mathrm{pop}_1^{n}(\mathrm{pop}_2^{m}(s))$.
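
The stack operations and the substack relation can be mirrored in a few lines of code. The following Python sketch is an illustration under our own representation choices, not part of the formal development: a stack is a tuple of words, a word is a tuple of letters, a letter with a level 1 link is a plain string, a letter with a level 2 link is a triple `(sigma, 2, k)`, and `None` stands for "undefined". All function names are hypothetical.

```python
BOTTOM = "⊥"
INITIAL = ((BOTTOM,),)          # the initial level 2 stack [⊥]

def clvl(s):
    """Collapse level of the topmost letter."""
    return 1 if isinstance(s[-1][-1], str) else 2

def clnk(s):
    """Collapse link of the topmost letter (m - 1 for level 1 links)."""
    top = s[-1][-1]
    return len(s[-1]) - 1 if isinstance(top, str) else top[2]

def clone2(s):
    return s + (s[-1],)

def push(s, sigma, k):
    # A level 2 link stores the width n - 1 of the stack below the topmost word.
    # (The definition requires sigma != ⊥; this is not checked here.)
    letter = sigma if k == 1 else (sigma, 2, len(s) - 1)
    return s[:-1] + (s[-1] + (letter,),)

def pop2(s):
    return s[:-1] if len(s) > 1 else None

def pop1(s):
    return s[:-1] + (s[-1][:-1],) if len(s[-1]) > 1 else None

def collapse(s):
    if clvl(s) == 2:
        k = clnk(s)
        return s[:k] if k > 0 else None
    return pop1(s)

def is_substack(sp, s):
    """sp <= s iff sp = pop1^n(pop2^m(s)) for some n, m."""
    t = s
    while t is not None:
        u = t
        while u is not None:
            if u == sp:
                return True
            u = pop1(u)
        t = pop2(t)
    return False

# clone, push a with a level 2 link, clone, push a with a level 2 link:
s = push(clone2(push(clone2(INITIAL), "a", 2)), "a", 2)
print(s, collapse(s), collapse(pop1(s)))
```

In the example the topmost letter of `s` carries the link 2, so `collapse(s)` returns the first two words of `s`; after one `pop1` the topmost letter is a cloned letter with link 1, and collapsing leads back to the initial stack.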

Having concluded the definitions concerning stacks and stack operations, it is time to
introduce collapsible pushdown systems.

Definition 2.6. A collapsible pushdown system of level 2 (CPS) is a tuple

\[ \mathcal{S} = (Q, \Sigma, \Gamma, \Delta, q_0) \]

where $Q$ is a finite set of states, $\Sigma$ a finite stack alphabet with a distinguished
bottom-of-stack symbol $\bot \in \Sigma$, $\Gamma$ a finite input alphabet, $q_0 \in Q$ the initial state, and

\[ \Delta \subseteq Q \times \Sigma \times \Gamma \times Q \times \mathrm{OP} \]

the transition relation.

Every $(q, s) \in \mathrm{Cnf}(Q, \Sigma) := Q \times \mathrm{Stck}(\Sigma)$ is called a configuration and $\mathrm{Cnf}(Q, \Sigma)$ is
called the set of configurations.² We define $\gamma$-labelled transitions $\vdash^{\gamma} \subseteq \mathrm{Cnf} \times \mathrm{Cnf}$ as follows:
$(q_1, s) \vdash^{\gamma} (q_2, t)$ if there is a $(q_1, \sigma, \gamma, q_2, op) \in \Delta$ such that $op(s) = t$ and $\mathrm{Sym}(s) = \sigma$.

We call $\vdash := \bigcup_{\gamma \in \Gamma} \vdash^{\gamma}$ the transition relation of $\mathcal{S}$. We set $C(\mathcal{S})$ to be the set of
all configurations that are reachable from $(q_0, \bot_2)$ via $\vdash$. These configurations are called
reachable. The collapsible pushdown graph (CPG) generated by $\mathcal{S}$ is

\[ \mathrm{CPG}(\mathcal{S}) := \big( C(\mathcal{S}), (C(\mathcal{S})^2 \cap \vdash^{\gamma})_{\gamma \in \Gamma} \big). \]

Example 2.7. The following example (Figure 1) of a collapsible pushdown graph $\mathfrak{G}$ of
level 2 is taken from [10]. Let $Q := \{0, 1, 2\}$, $\Sigma := \{\bot, a\}$, $\Gamma := \{Cl, A, A', P, Co\}$. $\Delta$ is
given by $(0, -, Cl, 1, \mathrm{clone}_2)$, $(1, -, A, 0, \mathrm{push}_{a,2})$, $(1, -, A', 2, \mathrm{push}_{a,2})$, $(2, a, P, 2, \mathrm{pop}_1)$, and
$(2, a, Co, 0, \mathrm{collapse})$, where $-$ denotes any letter from $\Sigma$.




Figure 1: Example of the 2-CPG $\mathfrak{G}$ (the level 2 links of the letters $a$ are omitted in the
representation of the stacks).
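
A finite part of the graph of Example 2.7 can be generated mechanically. The Python sketch below is an illustration only: it reuses the tuple representation of stacks (letters with level 2 links are triples `(sigma, 2, k)`), encodes the wildcard $-$ of the transition relation by `None`, and explores the configuration graph breadth-first up to a bounded number of steps. The names `DELTA`, `successors` and `explore` are our own.

```python
from collections import deque

BOT = "⊥"

def clone2(s):
    return s + (s[-1],)

def push_a2(s):
    # push_{a,2}: the new letter links to the substack of width |s| - 1.
    return s[:-1] + (s[-1] + (("a", 2, len(s) - 1),),)

def pop1(s):
    return s[:-1] + (s[-1][:-1],) if len(s[-1]) > 1 else None

def collapse(s):
    top = s[-1][-1]
    if isinstance(top, tuple):            # level 2 link (sigma, 2, k)
        return s[:top[2]] if top[2] > 0 else None
    return pop1(s)

def sym(s):
    top = s[-1][-1]
    return top[0] if isinstance(top, tuple) else top

# (state, stack symbol or None for "any letter", label, new state, operation)
DELTA = [
    (0, None, "Cl", 1, clone2),
    (1, None, "A", 0, push_a2),
    (1, None, "A'", 2, push_a2),
    (2, "a", "P", 2, pop1),
    (2, "a", "Co", 0, collapse),
]

def successors(conf):
    q, s = conf
    for q1, sigma, label, q2, op in DELTA:
        if q1 == q and sigma in (None, sym(s)):
            t = op(s)
            if t is not None:             # undefined operations yield no edge
                yield label, (q2, t)

def explore(max_steps):
    """Configurations within max_steps of (0, ⊥_2) and the edges leaving those
    found in fewer than max_steps steps."""
    start = (0, ((BOT,),))
    dist, edges, queue = {start: 0}, [], deque([start])
    while queue:
        c = queue.popleft()
        if dist[c] == max_steps:
            continue
        for label, d in successors(c):
            edges.append((c, label, d))
            if d not in dist:
                dist[d] = dist[c] + 1
                queue.append(d)
    return dist, edges

dist, edges = explore(6)
print(len(dist), "configurations,", len(edges), "edges")
```

For instance, the run $Cl, A'$ leads to the configuration with state 2 and stack $\bot : \bot a$, from which the $Co$-edge leads back to the initial configuration.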



Hague et al. [10] already noted that the previous example has undecidable MSO theory
because the half grid $\{(n, m) \in \mathbb{N}^2 : n \ge m\}$ is MSO interpretable in this graph (note that
the collapse edges of two vertices point to the same target if and only if the vertices are on
the same diagonal of the grid).

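Next we define the $\varepsilon$-contraction of a given collapsible pushdown graph. From now on,
we always assume that the input alphabet $\Gamma$ contains the symbol $\varepsilon$.

Definition 2.8. Let $\Gamma$ be some alphabet. Let $L$ and $L_\gamma$ be the regular languages
defined by the expressions $(\{\varepsilon\}^* (\Gamma \setminus \{\varepsilon\}))^*$ and $\{\varepsilon\}^* \gamma$, respectively. Given a collapsible
pushdown graph $\mathfrak{G}$, the $\varepsilon$-contraction $\mathfrak{G}/\varepsilon$ of $\mathfrak{G}$ is the graph $(M, (\mathrm{Reach}_{L_\gamma})_{\gamma \in \Gamma \setminus \{\varepsilon\}})$ where
$M := \{ g \in \mathfrak{G} : \mathfrak{G} \models \mathrm{Reach}_L((q_0, \bot_2), g) \}$.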



² We write Cnf instead of $\mathrm{Cnf}(Q, \Sigma)$ if $Q$ and $\Sigma$ are clear from the context.

Remark 2.9. This is the usual definition of $\varepsilon$-contraction. An edge in the new graph
consists of a sequence of $\varepsilon$-edges followed by one non-$\varepsilon$-edge. The set of configurations is
then restricted to those configurations that are reachable via the new edges from the initial
configuration.
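
For a finite edge-labelled graph, the $\varepsilon$-contraction can be computed directly from this description. The following Python sketch is an illustration under our own assumptions: the graph is finite, edges are stored in an adjacency dictionary, and the label `"eps"` plays the role of $\varepsilon$; the names `eps_closure` and `eps_contraction` are hypothetical.

```python
def eps_closure(edges, node, eps="eps"):
    """All nodes reachable from `node` via a (possibly empty) sequence of eps-edges."""
    seen, stack = {node}, [node]
    while stack:
        v = stack.pop()
        for label, w in edges.get(v, []):
            if label == eps and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

def eps_contraction(edges, initial, eps="eps"):
    """Return (M, new_edges): M is the set of nodes reachable from `initial` via
    the new edges; v -gamma-> w is a new edge iff there is a path eps* gamma
    from v to w in the original graph."""
    new_edges = {}
    domain, todo = {initial}, [initial]
    while todo:
        v = todo.pop()
        for u in eps_closure(edges, v, eps):
            for label, w in edges.get(u, []):
                if label != eps:
                    new_edges.setdefault(v, set()).add((label, w))
                    if w not in domain:
                        domain.add(w)
                        todo.append(w)
    return domain, new_edges

# 0 -eps-> 1 -a-> 2 -eps-> 3 -b-> 0
edges = {0: [("eps", 1)], 1: [("a", 2)], 2: [("eps", 3)], 3: [("b", 0)]}
print(eps_contraction(edges, 0))
```

In the toy graph the nodes 1 and 3 are only reachable via paths ending in an $\varepsilon$-edge, so they do not belong to the contracted graph, which consists of the edges $0 \to 2$ (labelled $a$) and $2 \to 0$ (labelled $b$).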

Now we come to the notion of a run of a collapsible pushdown system. 

Definition 2.10. Let $\mathcal{S}$ be a collapsible pushdown system. A run $\rho$ of $\mathcal{S}$ is a sequence of
configurations that are connected by transitions, i.e., a sequence

\[ c_0 \vdash^{\gamma_1} c_1 \vdash^{\gamma_2} c_2 \vdash^{\gamma_3} \dots \vdash^{\gamma_n} c_n. \]

We denote by $\rho(i) := c_i$ the $i$-th configuration of $\rho$. Moreover, we denote by $\mathrm{length}(\rho) := n$
the length of $\rho$. For $0 \le i \le j \le \mathrm{length}(\rho)$, we write $\rho{\restriction}_{[i,j]}$ for the subrun

\[ c_i \vdash^{\gamma_{i+1}} c_{i+1} \vdash^{\gamma_{i+2}} \dots \vdash^{\gamma_j} c_j. \]

We write $\mathrm{Runs}(c, c')$ for the set of runs starting at $c$ and ending in $c'$. For $s, s'$ stacks, we
also write $\mathrm{Runs}(s, s') := \bigcup_{q, q' \in Q} \mathrm{Runs}((q, s), (q', s'))$.

Consider some configuration $(q, s)$ of a CPS. If $|s| = n$ then a $\mathrm{push}_{\sigma,2}$ transition applied
to $(q, s)$ creates a letter with a link to the substack of width $n - 1$. Thus, links to the substack
of width $n - 1$ in some word above the $n$-th one are always created by a $\mathrm{clone}_2$ operation.
A direct consequence of this fact is the following lemma.

Lemma 2.11. Let $s$ be some level 2 stack with $\mathrm{top}_1(s) = (\sigma, 2, k)$. Let $\rho \in \mathrm{Runs}(s, s)$ be a
run that passes $\mathrm{pop}_1(s)$. If $k < |s| - 1$ then $\rho$ passes $\mathrm{pop}_2(s)$.

The proof is left to the reader. Later we often use the contrapositive: if $\mathrm{CLvl}(s) = 2$
and $\mathrm{CLnk}(s) < |s| - 1$, then any run in $\mathrm{Runs}(s, s)$ not passing $\mathrm{pop}_2(s)$ does not pass $\mathrm{pop}_1(s)$.

Following the ideas of Blumensath [3] for higher-order pushdown systems, we introduce
a prefix replacement for collapsible pushdown systems. This replacement allows us to copy
runs starting in one configuration into a run starting at another configuration.

Definition 2.12. For $t \in \mathrm{Stck}(\Sigma)$ and some substack $s \le t$ we say that $s$ is a prefix of
$t$ and write $s \trianglelefteq t$, if there are $n \le m \in \mathbb{N}$ such that $s = w_1 : w_2 : \dots : w_{n-1} : w_n$ and
$t = w_1 : w_2 : \dots : w_{n-1} : v_n : v_{n+1} : \dots : v_m$ such that $w_n \le v_j$ for all $n \le j \le m$.

For a configuration $c = (q, t)$, we write $s \trianglelefteq c$ as an abbreviation for $s \trianglelefteq t$. For some run
$\rho$, we write $s \trianglelefteq \rho$ if $s \trianglelefteq \rho(i)$ for all $i \in \mathrm{dom}(\rho)$.

Definition 2.13. Let $s, t, u$ be level 2 stacks such that $s \trianglelefteq t$. Assume that

\[ s = w_1 : w_2 : \dots : w_{n-1} : w_n, \]
\[ t = w_1 : w_2 : \dots : w_{n-1} : v_n : v_{n+1} : \dots : v_m, \text{ and} \]
\[ u = x_1 : x_2 : \dots : x_{n-1} : x_n \]

for numbers $n, m \in \mathbb{N}$ such that $n \le m$. For each $n \le i \le m$, let $\hat{v}_i$ be the unique word
such that $v_i = w_n \hat{v}_i$. We define

\[ t[s/u] := x_1 : x_2 : \dots : x_{n-1} : (x_n \hat{v}_n) : (x_n \hat{v}_{n+1}) : \dots : (x_n \hat{v}_m) \]

and call $t[s/u]$ the stack obtained from $t$ by replacing the prefix $s$ by $u$.

Remark 2.14. Note that for $t$ some stack with level 2 links, the resulting object $t[s/u]$
may be no stack. Take for example the stacks

\[ s = \bot(a, 2, 0) : \bot, \]
\[ t = \bot(a, 2, 0) : \bot(a, 2, 0), \text{ and} \]
\[ u = \bot : \bot. \]

Then $t[s/u] = \bot : \bot(a, 2, 0)$. This list of words cannot be created from the initial stack
using the stack operations because an element $(a, 2, 0)$ in the second word has to be a clone
of some element in the first one. But $(a, 2, 0)$ does not occur in the first word. Note that
$t[s/u]$ is always a stack if $s = \mathrm{pop}_2^{k}(t)$ for some $k \in \mathbb{N}$.
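
The prefix relation and the prefix replacement are purely syntactic operations on 2-words, which the following Python sketch makes explicit. It is an illustration only: stacks are tuples of words, words are tuples of letters, letters with level 2 links are triples, and the names `is_prefix` and `replace_prefix` are our own. As in Remark 2.14, the result of `replace_prefix` is a 2-word that need not be a stack.

```python
def is_prefix(s, t):
    """Shape condition of the prefix relation: s and t agree on the first
    |s| - 1 words and the last word of s is a prefix of all later words of t."""
    n = len(s)
    return (n <= len(t)
            and s[:-1] == t[:n - 1]
            and all(v[:len(s[-1])] == s[-1] for v in t[n - 1:]))

def replace_prefix(t, s, u):
    """The 2-word t[s/u] for a prefix s of t and a 2-word u with |u| = |s|."""
    if not is_prefix(s, t) or len(u) != len(s):
        raise ValueError("s must be a prefix of t and u must have the width of s")
    n = len(s)
    w_n, x_n = s[-1], u[-1]
    # Each word v_i = w_n v^_i above the prefix becomes x_n v^_i.
    return u[:-1] + tuple(x_n + v[len(w_n):] for v in t[n - 1:])

# The stacks of Remark 2.14.
B, a0 = "⊥", ("a", 2, 0)
s = ((B, a0), (B,))
t = ((B, a0), (B, a0))
u = ((B,), (B,))
print(replace_prefix(t, s, u))
```

On the stacks of Remark 2.14 the function returns the 2-word $\bot : \bot(a, 2, 0)$, which is not a stack.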

Lemma 2.15. Let $\rho$ be a run of some collapsible pushdown system $\mathcal{S}$ of level 2 and let $s$
and $u$ be stacks such that the following conditions are satisfied:

(1) $s \trianglelefteq \rho$,

(2) $\mathrm{top}_1(u) = \mathrm{top}_1(s)$,

(3) $|s| = |u|$, and

(4) for $\rho(0) = (q, t)$, $t[s/u]$ is a stack.

Under these conditions the function $\rho[s/u]$ defined by $\rho[s/u](i) := \rho(i)[s/u]$ is a run of $\mathcal{S}$.

Proof (sketch). The proof is by induction on the length of $\rho$. It is tedious but straightforward
to prove that $\rho(i)[s/u]$ and $\rho(i)$ share the same topmost element. Thus, the transition $\delta$
connecting $\rho(i)$ with $\rho(i + 1)$ is also applicable to $\rho(i)[s/u]$. By case distinction on the stack
operation one concludes that $\delta$ connects $\rho(i)[s/u]$ with $\rho(i + 1)[s/u]$. □

2.4. Finite Automata and Automatic Structures. In this section, we present the
basic theory of finite bottom-up tree-automata and tree-automatic structures. For a more
detailed introduction, we refer the reader to [8].

Definition 2.16. A (finite tree-) automaton is a tuple $\mathcal{A} = (Q, \Sigma, q_I, F, \Delta)$ where $Q$ is a
finite nonempty set of states, $\Sigma$ is a finite alphabet, $q_I \in Q$ is the initial state, $F \subseteq Q$ is the
set of final states, and $\Delta \subseteq Q \times \Sigma \times Q \times Q$ is the transition relation.

We next define the concept of a run of an automaton on a tree. 

Definition 2.17. A run of $\mathcal{A}$ on a binary $\Sigma$-labelled tree $t$ is a map $\rho : \mathrm{dom}(t)^{\oplus} \to Q$ such
that

• $\rho(d) = q_I$ for all $d \in \mathrm{dom}(t)^{+}$, and

• $(\rho(d), t(d), \rho(d0), \rho(d1)) \in \Delta$ for all $d \in \mathrm{dom}(t)$.

$\rho$ is accepting if $\rho(\varepsilon) \in F$. We say $t$ is accepted by $\mathcal{A}$ if there is an accepting run of $\mathcal{A}$ on $t$.
With each automaton $\mathcal{A}$, we associate the language

\[ L(\mathcal{A}) := \{ t : t \text{ is accepted by } \mathcal{A} \} \]

recognised (or accepted) by $\mathcal{A}$. The class of languages accepted by automata is called the
class of regular languages.
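
Acceptance by such an automaton can be decided bottom-up by computing, for every position, the set of states that some run may assign to it. The Python sketch below illustrates this under our own conventions: trees are dictionaries from positions to labels, an automaton is a triple `(q_init, final, delta)` with `delta` a set of quadruples $(q, \sigma, q_0, q_1)$, and the function names are hypothetical.

```python
def states_at(tree, d, automaton):
    """Set of states q such that some run on the subtree rooted at d labels d with q."""
    q_init, _, delta = automaton
    if d not in tree:
        # d lies in dom(t)^+ : every run labels it with the initial state.
        return {q_init}
    left = states_at(tree, d + "0", automaton)
    right = states_at(tree, d + "1", automaton)
    return {q for (q, a, q0, q1) in delta
            if a == tree[d] and q0 in left and q1 in right}

def accepts(tree, automaton):
    """True iff there is an accepting run, i.e. a run with a final state at the root."""
    _, final, _ = automaton
    return bool(states_at(tree, "", automaton) & final)

# Example automaton over {a, b}: the state is the parity of the number of a's.
delta = {(p0 ^ p1 ^ (x == "a"), x, p0, p1)
         for x in "ab" for p0 in (0, 1) for p1 in (0, 1)}
even_as = (0, {0}, delta)

print(accepts({"": "a", "0": "b", "1": "a"}, even_as))   # two a's
print(accepts({"": "a"}, even_as))                        # one a
```

The example automaton accepts exactly the trees with an even number of positions labelled $a$; the first tree is accepted and the second is rejected.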

Automata can be used to represent infinite structures. Such representations have good
computational behaviour. In the following we recall the definitions and important results
on tree-automatic structures.

We first introduce the convolution of trees. This is a tool for representing an $n$-tuple of
$\Sigma$-trees as a single tree over the alphabet $(\Sigma \cup \{\Box\})^n$ where $\Box$ is a padding symbol satisfying
$\Box \notin \Sigma$.

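Definition 2.18. The convolution of two $\Sigma$-labelled trees $t$ and $s$ is given by a function

\[ t \otimes s : \mathrm{dom}(t) \cup \mathrm{dom}(s) \to (\Sigma \cup \{\Box\})^2 \]

where $\Box$ is some new padding symbol, and

\[ (t \otimes s)(d) := \begin{cases} (t(d), s(d)) & \text{if } d \in \mathrm{dom}(t) \cap \mathrm{dom}(s), \\ (t(d), \Box) & \text{if } d \in \mathrm{dom}(t) \setminus \mathrm{dom}(s), \\ (\Box, s(d)) & \text{if } d \in \mathrm{dom}(s) \setminus \mathrm{dom}(t). \end{cases} \]

We also use the notation $\otimes(t_1, t_2, \dots, t_n)$ for $t_1 \otimes t_2 \otimes \dots \otimes t_n$.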
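
With trees stored as dictionaries from positions to labels, the convolution is a one-line construction. The following Python sketch is only an illustration; the name `convolution` and the padding string `"□"` are our own choices, and for $n > 2$ we label positions with flat $n$-tuples, which we assume is the intended reading of the iterated convolution.

```python
PAD = "□"

def convolution(*trees):
    """Convolution of the given trees: every position of the union of the
    domains is labelled by the tuple of labels, padded where a tree is undefined."""
    dom = set().union(*trees)
    return {d: tuple(t.get(d, PAD) for t in trees) for d in dom}

t = {"": "a", "0": "b"}
s = {"": "c", "1": "d"}
print(convolution(t, s))
```

For the two trees above, the root is labelled $(a, c)$, position 0 is labelled $(b, \Box)$ and position 1 is labelled $(\Box, d)$.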

Using convolutions of trees we can use a single automaton for defining n-ary relations 
on a set of trees. Thus, we can then use automata to represent a set and a tuple of n-ary 
relations on this set. If we can represent the domain of some structure and all its relations 
by automata, we call the structure automatic. 

Definition 2.19. We say a relation $R \subseteq \mathcal{T}_\Sigma^{n}$ is automatic if there is an automaton $\mathcal{A}$ such
that $L(\mathcal{A}) = \{ \otimes(t_1, t_2, \dots, t_n) : (t_1, t_2, \dots, t_n) \in R \}$.

A structure $\mathfrak{B} = (B, E_1, E_2, \dots, E_n)$ with relations $E_i$ is automatic if there are automata
$\mathcal{A}_B, \mathcal{A}_{E_1}, \mathcal{A}_{E_2}, \dots, \mathcal{A}_{E_n}$ and a bijection $f : L(\mathcal{A}_B) \to B$ such that for $c_1, c_2, \dots, c_n \in L(\mathcal{A}_B)$,
the automaton $\mathcal{A}_{E_i}$ accepts $\otimes(c_1, c_2, \dots, c_n)$ if and only if $(f(c_1), f(c_2), \dots, f(c_n)) \in E_i$.

In other words, $f$ is a bijection between $L(\mathcal{A}_B)$ and $B$ and the automata $\mathcal{A}_{E_i}$ witness
that the relations $E_i$ are automatic via $f$. We call $f$ a tree presentation of $\mathfrak{B}$.

Automatic structures form a nice class because automata theoretic techniques may be 
used to decide first-order formulas on these structures: 

Theorem 2.20 (0, pi], [II]). If $\mathfrak{B}$ is automatic, then its $\mathrm{FO}(\exists^{\mathrm{mod}}, \mathrm{Ram})$-theory is decidable.³



3. Collapsible Pushdown Graphs are Tree-Automatic

In Section 3.1 we present a bijection Enc between Cnf and a regular set of trees. Moreover,
Enc translates the reachability predicates $\mathrm{Reach}_L \subseteq \mathrm{Cnf} \times \mathrm{Cnf}$ for each regular language
$L$ into a tree-automatic relation. The proof of this claim, which is the technical core of this
paper, is developed in Sections 4 and 5. Before we present this proof, we show in Section 3.2
how this result can be used to prove our main theorem. Moreover, in Section 3.3 we discuss
the optimality of the first-order model-checking algorithm derived from this construction.

Regularity of the regular reachability predicates implies that $\mathrm{Enc}{\restriction}_{\mathrm{dom}(\mathrm{CPG}(\mathcal{S})/\varepsilon)}$ is an
automatic presentation of $\mathrm{CPG}(\mathcal{S})/\varepsilon$ because its domain and its transition relation can be
defined as reachability relations $\mathrm{Reach}_L$ for certain regular languages $L$. Note that for the
definition of the domain, we need the encoding of the initial configuration as parameter.
This parameter can be hard-coded because its encoding is a fixed tree.



³ $\mathrm{FO}(\exists^{\mathrm{mod}}, \mathrm{Ram})$ is the extension of FO by modulo counting quantifiers and by Ramsey quantifiers.

Figure 2: A stack with blocks forming a c-blockline.

3.1. Encoding of Level 2 Stacks in Trees. In this section we present an encoding of 
level 2 stacks in trees. The idea is to divide a stack into blocks and to encode different blocks 
in different subtrees. The crucial observation is that every stack is a list of words that share 
the same first letter. A block is a maximal list of words occurring in the stack which share 
the same two first letters. If we remove the first letter of every word of such a block, the 
resulting 2-word decomposes again as a list of blocks. Thus, we can inductively carry on to 
decompose parts of a stack into blocks and encode every block in a different subtree. The 
roots of these subtrees are labelled with the first letter of the block. This results in a tree 
where every initial left-closed path in the tree represents one word of the stack. A path of 
a tree is left-closed if its last element has no left successor (i.e., no 0-successor).

The following notation is useful for the formal definition of blocks. Let $w \in \Sigma^*$ be some
word and $s = w_1 : w_2 : \dots : w_n \in \Sigma^{*2}$ some 2-word. We write $s' := w \setminus s$ for
$s' = ww_1 : ww_2 : \dots : ww_n$. Note that $[w] \trianglelefteq (w \setminus s)$, i.e., $[w]$ is a prefix of $s'$. We say that $s'$ is $s$ prefixed
by $w$.

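Definition 3.1. Let $\sigma \in \Sigma$ and $b \in \Sigma^{+2}$. We call $b$ a $\sigma$-block if $b = [\sigma]$ or $b = \sigma\tau \setminus s'$ for
some $\tau \in \Sigma$ and some $s' \in \Sigma^{*2}$. If $b_1, b_2, \dots, b_n$ are $\sigma$-blocks, then we call $b_1 : b_2 : \dots : b_n$ a
$\sigma$-blockline.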

Note that every stack in $\mathrm{Stck}(\Sigma)$ forms a $\bot$-blockline. Furthermore, every blockline $l$
decomposes uniquely as $l = b_1 : b_2 : \dots : b_n$ of maximal blocks $b_i$ in $l$.

Another crucial observation is that a $\sigma$-block $b \in \Sigma^{*2} \setminus \Sigma$ decomposes as $b = \sigma \setminus l$ for
some blockline $l$ and we call $l$ the blockline induced by $b$. For a block of the form $[b]$ with
$b \in \Sigma$, we define the blockline induced by $[b]$ to be $\varepsilon$.

Definition 3.2. Let $l$ be a $\sigma$-blockline such that $l = b_1 : b_2 : \dots : b_n$ is its decomposition
into maximal blocks. Let $i_1, i_2, \dots, i_m$ be those indices such that for all $1 \le j \le n$ we have
$b_j \neq [\sigma]$ if and only if $j = i_k$ for some $1 \le k \le m$. For $1 \le k \le m$, let $b'_{i_k}$ be the 2-word such
that $b_{i_k} = \sigma \setminus b'_{i_k}$. We recursively define the blocks of $l$ to be the minimal set containing
$b_1, b_2, \dots, b_n$ and the blocks of each of the $b'_{i_k}$ ($1 \le k \le m$) seen as $\tau$-blockline for some
letter $\tau$.

See Figure 2 for an example of a stack with one of its blocklines.

Recall that the symbols of a collapsible pushdown stack (of level 2) come from the set
$\Sigma \cup (\Sigma \times \{2\} \times \mathbb{N})$ where $\Sigma$ is the stack alphabet. For $\tau \in \Sigma \cup (\Sigma \times \{2\} \times \mathbb{N})$, we encode a
$\tau$-blockline $l$ in a tree as follows. The root of the tree is labelled by $(\mathrm{Sym}(\tau), \mathrm{CLvl}(\tau))$. The
blockline induced by the first maximal block of $l$ is encoded in the left subtree and the rest
of $l$ is encoded in the right subtree. This means that we only encode explicitly the symbol
and the collapse level of each element of the stack, but not the collapse link. We will later
see how to decode the collapse links from the encoding of a stack. When we encode a part of
a blockline in the right subtree, we do not repeat the label $(\mathrm{Sym}(\tau), \mathrm{CLvl}(\tau))$, but replace
it by the empty word $\varepsilon$.

Figure 3: A stack $s$ and its encoding $\mathrm{Enc}(s)$: right arrows lead to 1-successors (right
successors), upward arrows lead to 0-successors (left successors).

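Definition 3.3. Let $\tau \in \Sigma \cup (\Sigma \times \{2\} \times \mathbb{N})$. Furthermore, let

\[ s = w_1 : w_2 : \dots : w_n \in (\Sigma \cup (\Sigma \times \{2\} \times \mathbb{N}))^{+2} \]

be some $\tau$-blockline. Let $w'_i$ be a word for each $1 \le i \le n$ such that $s = \tau \setminus [w'_1 : w'_2 : \dots : w'_n]$
and set $s' := w'_1 : w'_2 : \dots : w'_n$. As an abbreviation we write ${}_i s_k := w_i : w_{i+1} : \dots : w_k$
(and, analogously, ${}_i s'_k := w'_i : w'_{i+1} : \dots : w'_k$). Let
${}_1 s_j$ be a maximal block of $s$. Note that $j > 1$ implies that there is some $\tau' \in \Sigma \cup (\Sigma \times \{2\} \times \mathbb{N})$
and there are words $w''_{j'}$ for each $j' \le j$ such that $w_{j'} = \tau\tau' w''_{j'}$.

For arbitrary $\sigma \in (\Sigma \times \{1, 2\}) \cup \{\varepsilon\}$, we define recursively the $(\Sigma \times \{1, 2\}) \cup \{\varepsilon\}$-labelled
tree $\mathrm{Enc}(s, \sigma)$ via

\[ \mathrm{Enc}(s, \sigma) := \begin{cases} \sigma & \text{if } |w_1| = 1,\ n = 1, \\ \sigma\big(\emptyset;\ \mathrm{Enc}({}_2 s_n, \varepsilon)\big) & \text{if } |w_1| = 1,\ n > 1, \\ \sigma\big(\mathrm{Enc}({}_1 s'_n, (\mathrm{Sym}(\tau'), \mathrm{CLvl}(\tau')));\ \emptyset\big) & \text{if } |w_1| > 1,\ j = n, \\ \sigma\big(\mathrm{Enc}({}_1 s'_j, (\mathrm{Sym}(\tau'), \mathrm{CLvl}(\tau')));\ \mathrm{Enc}({}_{j+1} s_n, \varepsilon)\big) & \text{otherwise.} \end{cases} \]

For every $s \in \mathrm{Stck}(\Sigma)$, $\mathrm{Enc}(s) := \mathrm{Enc}(s, (\bot, 1))$ is called the encoding of the stack $s$.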

Figure 3 shows a configuration and its encoding.
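
The recursive definition of the encoding can be followed step by step in code. The Python sketch below is an illustration under our own representation choices: a stack is a tuple of words, a letter with a level 1 link is a string, a letter with a level 2 link is a triple `(sigma, 2, k)`, and the resulting tree is a dictionary from positions to labels, with the string `"ε"` as the label of right successors. The names `label_of`, `enc_blockline` and `enc` are hypothetical.

```python
EPS = "ε"

def label_of(letter):
    """(Sym, CLvl) of a letter; the collapse link itself is not encoded."""
    return (letter, 1) if isinstance(letter, str) else (letter[0], 2)

def enc_blockline(words, label, pos="", tree=None):
    """Encode the blockline `words` (all words start with the same letter)
    at position `pos`, labelling that position with `label`."""
    tree = {} if tree is None else tree
    tree[pos] = label
    if len(words[0]) == 1:
        j = 1                          # the first maximal block is [tau]
    else:
        second = words[0][1]           # tau', including its link
        j = 1
        while j < len(words) and len(words[j]) > 1 and words[j][1] == second:
            j += 1                     # words[:j] is the first maximal block
        # Left subtree: the blockline induced by the first maximal block.
        enc_blockline([w[1:] for w in words[:j]], label_of(second), pos + "0", tree)
    if j < len(words):
        # Right subtree: the rest of the blockline, with label epsilon.
        enc_blockline(words[j:], EPS, pos + "1", tree)
    return tree

def enc(stack):
    """Enc(s) := Enc(s, (⊥, 1)) for a stack whose words all start with ⊥."""
    return enc_blockline(list(stack), label_of(stack[0][0]))

B, a0, a2 = "⊥", ("a", 2, 0), ("a", 2, 2)
print(enc(((B, a0), (B, a0), (B, a2))))
```

For the stack $\bot(a,2,0) : \bot(a,2,0) : \bot(a,2,2)$ the positions 0 and 10 both receive the label $(a, 2)$ although they encode letters with different links, because two letters belong to the same block only if their links agree as well.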

Remark 3.4. Fix some stack $s$. For $\sigma \in \Sigma$ and $k \in \mathbb{N}$, every $(\sigma, 2, k)$-block of $s$ is encoded in
a subtree whose root $d$ is labelled $(\sigma, 2)$. We can restore $k$ from the position of $d \in \{0, 1\}^* 0$
in the tree $\mathrm{Enc}(s)$ as follows:

\[ k = |\{ d' \in \mathrm{Enc}(s) \cap \{0, 1\}^* 1 : d' \le_{\mathrm{lex}} d \}|, \]

where $\le_{\mathrm{lex}}$ is the lexicographic order. This is due to the fact that every right-successor
corresponds to the separation of some block from some other.
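
The decoding of a collapse link amounts to counting right successors. The Python sketch below illustrates the formula under our own conventions: a tree is a dictionary from positions (strings over `0` and `1`) to labels, and the example tree is our hand-written encoding of the stack $\bot(a,2,0) : \bot(a,2,0) : \bot(a,2,2)$; the function name `collapse_link` is hypothetical. Python compares strings lexicographically with prefixes being smaller, which matches $\le_{\mathrm{lex}}$ on $\{0,1\}^*$.

```python
def collapse_link(tree, d):
    """Number of positions of `tree` that end in 1 and are lexicographically <= d."""
    return sum(1 for e in tree if e.endswith("1") and e <= d)

# Hand-written encoding of the stack ⊥(a,2,0) : ⊥(a,2,0) : ⊥(a,2,2).
tree = {"": ("⊥", 1), "0": ("a", 2), "01": "ε", "1": "ε", "10": ("a", 2)}

print(collapse_link(tree, "0"), collapse_link(tree, "10"))
```

Position 0 has no right successor before it and therefore decodes to the link 0, while position 10 is preceded by the right successors 01 and 1 and decodes to the link 2.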

This correspondence can be seen as a bijection. Let $s = w_1 : w_2 : \dots : w_n$ be some
stack. We define the set $R := \mathrm{dom}(\mathrm{Enc}(s)) \cap (\{\varepsilon\} \cup \{0, 1\}^* 1)$. Then there is a bijection
$f : \{1, 2, 3, \dots, n\} \to R$ such that $i$ is mapped to the $i$-th element of $R$ in lexicographic
order. Each $1 \le i \le n$ represents the $i$-th word of $s$. $f$ maps the first word of $s$ to the root
of $\mathrm{Enc}(s)$ and every other word in $s$ to the element of $\mathrm{Enc}(s)$ that separates this word from
its left neighbour in $s$.

If we interpret $\varepsilon$ as empty word, the word from the root to $f(i)$ in $\mathrm{Enc}(s)$ is the
greatest common prefix of $w_{i-1}$ and $w_i$. More precisely, the word read along this path is
the projection onto the letters and collapse levels of $w_{i-1} \sqcap w_i$.

Furthermore, set $f'(i) := f(i) 0^m \in \mathrm{Enc}(s)$ such that $m$ is maximal with this property,
i.e., $f'(i)$ is the leftmost descendant of $f(i)$. Then the path from $f(i)$ to $f'(i)$ is the suffix
$w'_i$ such that $w_i = (w_{i-1} \sqcap w_i) w'_i$ (here we set $w_0 := \varepsilon$). More precisely, the word read along
this path is the projection onto the symbols and collapse levels of $w'_i$.

Having defined the encoding of a stack, we want to encode whole configurations, i.e., a 
stack together with a state. To this end, we just add the state as a new root of the tree and 
attach the encoding of the stack as left subtree, i.e., for some configuration (q, s) we set 

$$\mathrm{Enc}(q, s) := q(\mathrm{Enc}(s);\ \emptyset).$$

The image of this encoding function contains only trees of a very specific type. We call
this class $\mathbb{T}_{\mathrm{Enc}}$. In the next definition we state the characterising properties of $\mathbb{T}_{\mathrm{Enc}}$.

Definition 3.5. Let $\mathbb{T}_{\mathrm{Enc}}$ be the class of trees $T$ that satisfy the following conditions.

(1) The root of $T$ is labelled by some element of $Q$ ($T(\varepsilon) \in Q$).

(2) $1 \notin \mathrm{dom}(T)$, $0 \in \mathrm{dom}(T)$.

(3) $T(0) = (\bot, 1)$.

(4) Every element of the form $0\{0,1\}^*0$ is labelled by some $(\sigma, l) \in (\Sigma \setminus \{\bot\}) \times \{1,2\}$.

(5) Every element of the form $\{0,1\}^*1$ is labelled by $\varepsilon$.

(6) There is no $t \in T$ such that $T(t0) = (\sigma, 1)$ and $T(t10) = (\sigma, 1)$.
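The six conditions are purely local and can be checked directly. The following Python sketch is one way to do so, assuming that a tree is given as a dictionary from addresses over {0, 1} to labels, that pairs (σ, l) are Python tuples, and that ε and ⊥ are written with the hypothetical names `EPS` and `BOT`. The additional prefix-closure test reflects the usual assumption on tree domains and is not one of the six conditions.

```python
EPS = "eps"   # assumed name for the label ε
BOT = "_"     # assumed name for the bottom symbol ⊥


def in_t_enc(tree, states, alphabet):
    """Check conditions (1)-(6) for a tree given as {address: label}."""
    dom = set(tree)
    # Assumption: a tree domain is non-empty and prefix-closed.
    if "" not in dom or any(d[:-1] not in dom for d in dom if d):
        return False
    if tree[""] not in states:                        # condition (1)
        return False
    if "1" in dom or "0" not in dom:                  # condition (2)
        return False
    if tree["0"] != (BOT, 1):                         # condition (3)
        return False
    for d, label in tree.items():
        if len(d) >= 2 and d.startswith("0") and d.endswith("0"):
            is_pair = isinstance(label, tuple) and len(label) == 2
            sym, lvl = label if is_pair else (None, None)
            if sym not in alphabet or sym == BOT or lvl not in (1, 2):
                return False                          # condition (4)
        elif d.endswith("1") and label != EPS:
            return False                              # condition (5)
    for t in dom:                                     # condition (6)
        left, right_left = tree.get(t + "0"), tree.get(t + "10")
        if left is not None and left == right_left and left[1] == 1:
            return False
    return True
```

For example, the two-node tree with root label `"q0"` and left successor `(BOT, 1)` passes the check, while a tree in which the nodes 00 and 010 both carry `("a", 1)` is rejected by condition (6).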

Remark 3.6. Note that all trees in the image of $\mathrm{Enc}$ satisfy condition (6) due to the following.
$T(t0) = T(t10) = (\sigma, 1)$ would imply that the subtree rooted at $t$ encodes a blockline $l$ such
that the first block $b_1$ of $l$ induces a $\sigma$-blockline and the second block $b_2$ also induces a
$\sigma$-blockline. This contradicts the maximality of the blocks used in the encoding because
all words of $b_1$ and $b_2$ have $\sigma$ as second letter whence $b_1 : b_2$ forms a larger block. Note
that for letters with links of level 2 the analogous restriction does not hold. In Figure 3 one
sees the encoding of a stack $s$ where $\mathrm{Enc}(s)(0) = \mathrm{Enc}(s)(10) = (a, 2)$. Here, the label $(a, 2)$
represents two different letters. $\mathrm{Enc}(s)(0)$ encodes the element $(a, 2, 0)$, while $\mathrm{Enc}(s)(10)$
encodes the element $(a, 2, 2)$, i.e., the first element encodes a letter $a$ with undefined link
and the second encodes the letter $a$ with a link to the substack of width 2.

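Lemma 3.7. There is a finite automaton $\mathcal{A}_{\mathbb{T}_{\mathrm{Enc}}}$ with $2 + 3|\Sigma|$ many states that recognises
$\mathbb{T}_{\mathrm{Enc}}$.

Proof. Set $\mathcal{A}_{\mathbb{T}_{\mathrm{Enc}}} := (Q_{\mathcal{A}}, Q \cup (\Sigma \times \{1,2\}) \cup \{\varepsilon\}, \bot, \{q_I\}, \Delta_{\mathcal{A}})$ where $Q_{\mathcal{A}}$ and $\Delta_{\mathcal{A}}$ are defined
as follows. Let $Q_{\mathcal{A}} := \{\bot, q_I\} \cup (\Sigma \times \{1,2\}) \cup \{P_\sigma : \sigma \in \Sigma\}$. The states of the form $(\sigma, i)$
are used to guess that a node of the tree is labelled by $(\sigma, i)$ while the states $P_\sigma$ are used to
prohibit that the left successor of a node is labelled by $(\sigma, 1)$ ($P_\bot$ is used if no restriction
applies). The transitions ensure that whenever we guess that $d0$ is labelled by $(\sigma, 1)$ then
$d1$ is reached in state $P_\sigma$, ensuring that $d10$ cannot be labelled by $(\sigma, 1)$. For the definition
of $\Delta_{\mathcal{A}}$ we use the following conventions: $q$ ranges over $Q$, $i, j$ range over $\{1,2\}$, $\sigma$ over $\Sigma$,
$\tau$ over $\Sigma \setminus \{\bot\}$ and $\tau_\sigma$ over $\Sigma \setminus \{\bot, \sigma\}$ whenever $\sigma$ is fixed. Set
$\Delta_{\mathcal{A}} := \{(q_I, q, (\bot, 1), \bot)$,
$((\sigma,i), (\sigma,i), (\tau,1), P_\tau)$, $((\sigma,i), (\sigma,i), (\tau,2), P_\bot)$, $((\sigma,i), (\sigma,i), (\tau,j), \bot)$,
$((\sigma,i), (\sigma,i), \bot, P_\bot)$, $((\sigma,i), (\sigma,i), \bot, \bot)$,
$(P_\sigma, \varepsilon, (\tau_\sigma,1), P_{\tau_\sigma})$, $(P_\sigma, \varepsilon, (\tau,2), P_\bot)$, $(P_\sigma, \varepsilon, (\tau_\sigma,1), \bot)$, $(P_\sigma, \varepsilon, (\tau,2), \bot)$,
$(P_\sigma, \varepsilon, \bot, P_\bot)$, $(P_\sigma, \varepsilon, \bot, \bot)\}$. □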

Lemma 3.8. $\mathrm{Enc} : Q \times \mathrm{Stck}(\Sigma) \to \mathbb{T}_{\mathrm{Enc}}$ is a bijection. We denote its inverse by $\mathrm{Dec}$.

The proof of this lemma is tedious. It can be found in the appendix.
3.2. Tree-automaticity of Collapsible Pushdown Graphs. Our main technical contribution
in this paper is stated in the next proposition. It concerns the regularity of the
regular reachability predicates $\mathrm{Reach}_L$ with respect to our encoding of configurations. We
postpone the proof of this proposition to Section 5.

Proposition 3.9. There are polynomials $p_1$ and $p_2$ such that the following holds. Let
$\mathcal{S} = (Q, \Sigma, \Gamma, q_0, \Delta)$ be some collapsible pushdown system of level 2 and let $L$ be some
regular language over $\Gamma$ recognised by some nondeterministic finite automaton with state set
$P$. $\mathrm{Reach}_L$ is tree-automatic via $\mathrm{Enc}$ and there is a nondeterministic finite tree-automaton
with $p_1(|\Sigma|) \cdot \exp(p_2(|Q| \cdot |P|))$ many states recognising $\mathrm{Reach}_L$ in this encoding.

Remark 3.10. In the proposition, $\mathrm{Reach}_L$ has to be understood with respect to all possible
configurations of a level 2 collapsible pushdown system as opposed to those occurring in
the configuration graph of $\mathcal{S}$, i.e., those reachable via the transitions of $\mathcal{S}$ from the initial
configuration of $\mathcal{S}$.

We obtain the automaticity of the $\varepsilon$-contractions of all level 2 collapsible pushdown
graphs as a direct corollary of the previous result.

Corollary 3.11. There are polynomials $p$ and $q$ such that the following holds. Given a
CPS $\mathcal{S} = (Q, \Sigma, \Gamma, q_0, \Delta)$, the $\varepsilon$-contraction $\mathrm{CPG}(\mathcal{S})/\varepsilon$ is regular via $\mathrm{Enc}$. Moreover, there
is a presentation such that each automaton in the presentation of $\mathrm{CPG}(\mathcal{S})/\varepsilon$ has at most
$p(|\Sigma|) \cdot \exp(q(|Q|))$ many states.

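Proof. The domain of $\mathrm{CPG}(\mathcal{S})/\varepsilon$ is $\{c : \mathrm{CPG}(\mathcal{S}) \models \mathrm{Reach}_{(\{\varepsilon\}^*(\Gamma \setminus \{\varepsilon\}))^*}((q_0, \bot_2), c)\}$. Note
that $(\{\varepsilon\}^*(\Gamma \setminus \{\varepsilon\}))^*$ is accepted by an automaton with 2 states. Furthermore, hard-coding
$\mathrm{Enc}(q_0, \bot_2)$ as first argument to the automaton from Proposition 3.9 increases the number of
states by at most a factor 3 (because $\mathrm{dom}(\mathrm{Enc}(q_0, \bot_2)) = \{\varepsilon, 0\}$). Thus, the corresponding
automaton has $3 \cdot p_1(|\Sigma|) \cdot \exp(p_2(2 \cdot |Q|))$ many states where $p_1$ and $p_2$ are the polynomials
from Proposition 3.9.

Similarly, $\vdash^\gamma$ in $\mathrm{CPG}(\mathcal{S})/\varepsilon$ is exactly the same as $\mathrm{Reach}_{\{\varepsilon\}^*\gamma}$. Again 2 states suffice to
recognise $\{\varepsilon\}^*\gamma$. □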
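The two state bounds used in this proof can be illustrated by writing the automata down explicitly. The following Python sketch assumes that a label sequence is a list of strings and that the silent label ε is written with the hypothetical name `EPS`; both function names are illustrative only.

```python
EPS = "eps"   # assumed name for the label ε of silent transitions


def accepts_contracted_paths(word):
    """Two-state automaton for ({ε}*(Γ \\ {ε}))*.

    State 0 (accepting): the word read so far is empty or ends in a
    non-ε label. State 1: the word read so far ends in ε.
    """
    state = 0
    for label in word:
        state = 1 if label == EPS else 0
    return state == 0


def accepts_contracted_edge(word, gamma):
    """Two-state automaton for {ε}*γ, assuming γ is different from ε.

    State 0: only ε's have been read. State 1 (accepting): γ was read.
    """
    state = 0
    for label in word:
        if state == 0 and label == EPS:
            continue
        if state == 0 and label == gamma:
            state = 1
        else:
            return False      # no transition is available
    return state == 1
```

The first automaton accepts exactly the label sequences that are empty or end in a non-ε label, which is the language $(\{\varepsilon\}^*(\Gamma \setminus \{\varepsilon\}))^*$.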

3.3. Lower Bound for FO Model-Checking. Since CPG are tree-automatic, their FO
model-checking problem is decidable. The algorithm obtained this way has nonelementary
complexity. In this section we prove that we cannot do better: there is a fixed collapsible
pushdown graph of level 2 whose FO theory has nonelementary complexity. We present a
reduction of FO model-checking on the full infinite binary tree to FO model-checking on this
collapsible pushdown graph. Recall that FO model-checking on the full infinite binary tree
$\mathfrak{T} := (T, \preceq, S_1, S_2)$ with prefix order $\preceq$ and successor relations $S_1, S_2$ has a nonelementary
lower bound (cf. Example 8.3 in [9]).

Theorem 3.12. The expression complexity of any FO model-checking algorithm for level
2 collapsible pushdown graphs is nonelementary.

Remark 3.13. Note that this is a statement about plain collapsible pushdown graphs and
not about the $\varepsilon$-contractions. In contrast to the theorem, the first-order model-checking
problem on non-$\varepsilon$-contracted level 1 pushdown graphs is complete for alternating exponential
time [17].
Proof. We modify the CPS of Example 2.7. We add the transition $(2, a, P_2, \mathrm{pop}_2, 0)$. Note
that the ordinal $(\omega, \preceq)$ is first-order definable in this graph: restrict the domain to all
elements with state 0. The order $\preceq$ is then defined via $\varphi_\preceq(x, y) := \exists z\, (z \vdash^{P_2} y \wedge z \vdash^{\mathrm{Co}} x)$.

Now, we obtain the binary tree $(T, \preceq)$ by use of the stack alphabet $\{\bot, a, b\}$. For each
occurrence of $a$ in a transition $\delta$, we make a copy of $\delta$ where we replace $a$ by $b$. Then,
each configuration $c$ with state 0 is determined by $\mathrm{top}_2(c)$ and these are in bijection to the
set $\{a, b\}^*$. Furthermore, $\varphi_\preceq$ defines the prefix relation on this set. Thus, $(T, \preceq, S_1, S_2)$ is
FO-interpretable in this graph whence its FO theory has nonelementary complexity. □

4. Decomposition of Runs 

In this section we develop the technical background for the proof that the regular reacha- 
bility predicates are tree-automatic via Enc. We investigate the structure of runs of CPS. 
We prove that any run is composed from subruns which can be classified as returns, loops, 
or 1-loops. Forgetting about technical details, one can say that returns are runs from some
stack $s$ to $\mathrm{pop}_2(s)$, loops are runs that start and end in the same stack and 1-loops are runs
from some stack $s$ to a stack $s'$ such that $s$ and $s'$ share the same topmost word and $s$ is a
substack of $s'$. Every run decomposes as a sequence of the form
$\lambda_0 \circ \rho_1 \circ \lambda_1 \circ \rho_2 \circ \dots \circ \rho_n \circ \lambda_n$
where the $\rho_i$ only perform one operation each and the $\lambda_i$ are returns, loops or 1-loops of
maximal length in a certain sense. Let us explain this idea precisely in the case of a run
$\rho$ from some stack $s_1$ to a stack $s_2$ such that $s_1 = \mathrm{pop}_2^k(s_2)$. In this special case, the $\lambda_i$
are all loops and the sequence of operations induced by $\rho_1, \dots, \rho_n$ is a sequence of minimal
length transforming $s_1$ into $s_2$. This sequence of minimal length is in fact unique up to
replacement of $\mathrm{pop}_1$ and collapse operations of level 1. As a direct consequence, the loops
$\lambda_i$ occurring in the decomposition cover the largest possible part of $\rho$ in terms of loops,
returns and 1-loops. This is also the key to understanding our decomposition result for 
general runs: we identify maximal subruns of an arbitrary run which are returns, loops and 
1-loops and we prove that the parts not contained in one of these subruns form a short 
sequence of operations. 

Hence, understanding the existence of returns, loops and 1-loops allows us to clarify
whether runs between certain configurations exist. It turns out that our decomposition 
is very suitable for the analysis with finite automata because such automata can be used to 
decide whether returns, loops and 1-loops starting in a given stack exist. 

We next start with a general decomposition of any run into four parts. Afterwards we 
prove decomposition results for each of the parts where returns, loops and 1-loops are the 
central pieces of the decomposition. Finally, we show how finite automata acting on the 
topmost word of a stack can be used to compute the existence of returns, loops and 1-loops 
starting at this stack. 

4.1. Decomposition of General Runs. We introduce a decomposition of an arbitrary
run $\rho$ into four parts. The idea is that every run from a stack $s_1$ to a stack $s_2$ passes a
minimal common substack $t$ of $s_1$ and $s_2$. Any run from $s_1$ to $t$ decomposes into a first part
from $s_1$ to a stack of the form $t_1 := \mathrm{pop}_2^k(s_1)$ such that $|t_1| = |t|$ and a second part from
$t_1$ to $t$. Similarly, for the unique stack $t_2 := \mathrm{pop}_2^j(s_2)$ such that $|t_2| = |t|$ the run from $t$
to $s_2$ decomposes into a run from $t$ to $t_2$ and a run from $t_2$ to $s_2$. In the following sections
we prove that every part of this decomposition again decomposes into returns, loops, and
1-loops.

Lemma 4.1. Let $c_1 = (q_1, s_1)$ and $c_2 = (q_2, s_2)$ be configurations and $\rho \in \mathrm{Runs}(c_1, c_2)$. Let
$t$ be the minimal substack of $s_1$ such that $\rho$ visits $t$. Furthermore, let $m_1 := \mathrm{pop}_2^{|s_1| - |t|}(s_1)$
and $m_2 := \mathrm{pop}_2^{|s_2| - |t|}(s_2)$. $\rho$ decomposes as $\rho = \rho_1 \circ \rho_2 \circ \rho_3 \circ \rho_4$ where

• $\rho_1 \in \mathrm{Runs}(s_1, m_1)$ does not visit any substack of $m_1$ before its final configuration,

• $\rho_2 \in \mathrm{Runs}(m_1, t)$ does not visit any substack of $t$ before its final configuration,

• $\rho_3 \in \mathrm{Runs}(t, m_2)$ does not visit a substack of $\mathrm{pop}_1(t)$, and

• $\rho_4 \in \mathrm{Runs}(m_2, s_2)$ does not visit any substack of $m_2$ after its initial configuration.

Proof. Let $i_2 \in \mathrm{dom}(\rho)$ be minimal such that $\rho(i_2) = t$. If $t = m_1$ then set $i_1 := i_2$.
Otherwise there is some minimal $i_1 < i_2$ such that $\rho(i_1) = m_1$: note that a stack operation
alters either the width of the stack or the content of the topmost word. Thus, before
reaching $t$, $\rho$ must visit some stack $m$ of width at most $|t|$ and of the form $m = \mathrm{pop}_2^k(s_1)$
for some $k \in \mathbb{N}$. Since $m$ cannot be a substack of $t$, $m = m_1$. Thus, we set $\rho_1 := \rho{\restriction}_{[0, i_1]}$ and
$\rho_2 := \rho{\restriction}_{[i_1, i_2]}$.

For the definition of $\rho_3$ and $\rho_4$, note that $|m_2| = |t|$. Let $i_3 \in \mathrm{dom}(\rho)$ with $i_2 \le i_3$ be maximal
such that $|\rho(i_3)| = |t|$. Since the first $|t|$ words of the stack are not changed by $\rho$ after $i_3$,
$\rho(i_3) = m_2$ and $i_3$ is the last occurrence of $m_2$. Setting $\rho_3 := \rho{\restriction}_{[i_2, i_3]}$ and $\rho_4 := \rho{\restriction}_{[i_3, \mathrm{length}(\rho)]}$
we are done. □

This decomposition motivates the following definition. 

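Definition 4.2. Given a collapsible pushdown system $\mathcal{S}$, we define the following four relations
on the configurations of $\mathcal{S}$:

$$R^{\Leftarrow} := \{(c_1, c_2) \in \mathrm{Cnf}^2 : c_2 = \mathrm{pop}_2^k(c_1) \text{ and } \exists \rho \in \mathrm{Runs}(c_1, c_2)\ \forall i < \mathrm{length}(\rho)\ \rho(i) \not\le c_2\}$$

$$R^{\downarrow} := \{(c_1, c_2) \in \mathrm{Cnf}^2 : c_2 = \mathrm{pop}_1^k(c_1) \text{ and } \exists \rho \in \mathrm{Runs}(c_1, c_2)\ \forall i < \mathrm{length}(\rho)\ \rho(i) \not\le c_2\}$$

$$R^{\uparrow} := \{(c_1, c_2) \in \mathrm{Cnf}^2 : c_1 = \mathrm{pop}_1^k(c_2) \text{ and } \exists \rho \in \mathrm{Runs}(c_1, c_2)\ \forall i \le \mathrm{length}(\rho)\ \rho(i) \not< c_1\}$$

$$R^{\Rightarrow} := \{(c_1, c_2) \in \mathrm{Cnf}^2 : c_1 = \mathrm{pop}_2^k(c_2) \text{ and } \exists \rho \in \mathrm{Runs}(c_1, c_2)\ \forall i > 0\ \rho(i) \not\le c_1\}.$$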

Remark 4.3. Since we allow runs of length 0, the relations $R^{\Leftarrow}$, $R^{\downarrow}$, $R^{\uparrow}$ and $R^{\Rightarrow}$ are
reflexive. Lemma 4.1 states that

$$(c_1, c_2) \in \mathrm{Reach} \iff \exists d, e, f\ \ (c_1, d) \in R^{\Leftarrow} \wedge (d, e) \in R^{\downarrow} \wedge (e, f) \in R^{\uparrow} \wedge (f, c_2) \in R^{\Rightarrow}.$$

In Section 5.2 we show that the relations $R^{\Leftarrow}$, $R^{\downarrow}$, $R^{\uparrow}$ and $R^{\Rightarrow}$ are automatic whence
$\mathrm{Reach}$ is also automatic. In the next section, we prove a decomposition result that especially
applies to all runs in $R^{\Rightarrow}$. Afterwards, in Section 4.3 we provide a corresponding decomposition
result for all runs in $R^{\Leftarrow}$. Finally, we provide (much simpler) decompositions for $R^{\downarrow}$
and $R^{\uparrow}$ in Section 4.4.
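The equivalence stated in this remark says that reachability is the relational composition of the four relations, taken in the order pop-2 phase, pop-1 phase, push-1 phase, growth phase. The following Python sketch spells this out on finite toy relations. It is only an illustration: the actual relations are infinite and are handled via tree-automata, and all names, including the abstract configuration names in the usage note, are hypothetical.

```python
def compose(r, s):
    """Relational composition: {(a, c) : (a, b) in r and (b, c) in s}."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


def reach_from_parts(r_pop2, r_pop1, r_push1, r_grow2):
    """Compose the four relations in the order given by Lemma 4.1."""
    return compose(compose(compose(r_pop2, r_pop1), r_push1), r_grow2)


def reflexive(nodes, pairs=()):
    """Build a reflexive relation on `nodes` containing `pairs`."""
    return {(x, x) for x in nodes} | set(pairs)
```

With four abstract configurations `c1`, `d`, `e`, `c2` and the relations `reflexive(nodes, [("c1", "d")])`, `reflexive(nodes, [("d", "e")])`, `reflexive(nodes)` and `reflexive(nodes, [("e", "c2")])`, the composition contains the pair `("c1", "c2")`. Reflexivity of the four relations is what allows phases to be skipped.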

4.2. Milestones, Loops and Increasing Runs. In this section we aim at a decomposition
result for all runs in $R^{\Rightarrow}$. For this purpose we first introduce the notion of generalised
milestones of some stack $s$. The underlying idea is as follows. Some stack $m$ is a generalised
milestone of $s$ if any run from the initial configuration to $s$ of any collapsible pushdown system
also passes $m$. Moreover, if some run ending in $s$ passes some generalised milestone
$m$ of $s$, then it passes all generalised milestones of $s$ that are not generalised milestones of
$m$. From the definition of generalised milestones it will be obvious that every run $\rho$ in $R^{\Rightarrow}$
starts at a generalised milestone of its final stack. Thus, a run in $R^{\Rightarrow}$ can be decomposed
into parts that connect one generalised milestone of its final stack with the next generalised
milestone. After introducing the precise notion of a loop we will see that each of these parts
consists of such a loop plus one further transition. At first glance, the formal definition
of a generalised milestone has nothing to do with our informal description. The connection
between the intended meaning and the formal definition is that in order to create some
stack $s$ from the initial stack $\bot_2$, we have to create it word-by-word and each word
letter-by-letter in the following sense. If we want to create $s = w_1 : w_2 : \dots : w_k$, we have to
use push operations in order to create the first word, i.e., the stack $[w_1]$. Then we have to
apply $\mathrm{clone}_2$ and obtain $w_1 : w_1$. In order to generate $s$ from this stack, we first have to
generate $w_1 : w_2$ and then we can proceed generating the other words of $s$. But for this
purpose, we first have to remove every letter from the second copy of $w_1$ until we reach the
greatest common prefix of $w_1$ and $w_2$. This can only be done by iteratively applying $\mathrm{pop}_1$
or collapse operations of level 1. Having reached $w_1 : (w_1 \sqcap w_2)$, we again start to create
$w_1 : w_2$ by using push operations of level 1.

This way of creating $s$ from $\bot_2$ is the shortest method to create $s$, which is unique up to
replacements of $\mathrm{pop}_1$ operations by collapse of level 1 and vice versa. At the same time any
other method contains this pattern as a (scattered) subsequence (again up to replacement of
$\mathrm{pop}_1$ by collapse of level 1 and vice versa). If we deviate from the described way of creating
$s$, then we just insert some loops where we first create some different stack and then return
to the position where we started to deviate. At the end of this section, Corollary 4.10 will
show that our intuition is correct. Let us now formally define generalised milestones.

Definition 4.4. Let $s = w_1 : w_2 : \dots : w_k$ be a stack and let $w_0 := \bot$. We call a stack $m$ a
generalised milestone of $s$ if $m$ is of the form

$$m = w_1 : w_2 : \dots : w_i : v_{i+1} \text{ where } 0 \le i < k,$$

$$w_i \sqcap w_{i+1} \le v_{i+1} \text{ and}$$

$$v_{i+1} \le w_i \text{ or } v_{i+1} \le w_{i+1}.$$

We denote by $\mathrm{GMS}(s)$ the set of all generalised milestones of $s$.

For a generalised milestone $m$ of $s$, we call $m$ a milestone of $s$ if $m$ is a substack of $s$,
i.e., if $v_{i+1} \le w_{i+1}$ in the above definition. We write $\mathrm{MS}(s)$ for the set of all milestones of $s$.

We next define a partial order $\ll$ that turns out to be linear when restricted to the set
$\mathrm{GMS}(s)$.

Definition 4.5. We define a partial order $\ll$ on all stacks as follows. $\ll$ is the smallest
reflexive and transitive relation that satisfies the following conditions. Let $s = w_1 : w_2 : \dots : w_k$
and $t = v_1 : v_2 : \dots : v_l$ be stacks. $s \ll t$ holds if

(1) $|s| < |t|$, or

(2) $l = k$, $w_i = v_i$ for $i < k$ and $v_k \le w_k \le w_{k-1} = v_{k-1}$, or

(3) $l = k$, $w_i = v_i$ for $i < k$ and $w_k \le v_k \not\le w_{k-1} = v_{k-1}$.

For each stack $s$, we now characterise $\ll$ restricted to $\mathrm{GMS}(s)$. The straightforward
proofs of the following lemmas are left to the reader.

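Lemma 4.6. Let $s = w_1 : w_2 : \dots : w_k$ be a stack and $m_1, m_2 \in \mathrm{GMS}(s)$. Then $m_1 \ll m_2$
if one of the following holds:

(1) $|m_1| < |m_2|$,

(2) $|m_1| = |m_2| = i$, $w_{i-1} \sqcap w_i \le \mathrm{top}_2(m_1) \le w_{i-1}$ and $w_{i-1} \sqcap w_i \le \mathrm{top}_2(m_2) \le w_i$,

(3) $|m_1| = |m_2| = i$ and $w_{i-1} \sqcap w_i \le \mathrm{top}_2(m_2) \le \mathrm{top}_2(m_1) \le w_{i-1}$, or

(4) $|m_1| = |m_2| = i$ and $w_{i-1} \sqcap w_i \le \mathrm{top}_2(m_1) \le \mathrm{top}_2(m_2) \le w_i$.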

Lemma 4.7. For each stack $s$, $\ll$ induces a finite linear order on $\mathrm{GMS}(s)$. Moreover, if
$m \in \mathrm{GMS}(s)$ then $(\mathrm{GMS}(m), \ll)$ is the initial segment of $(\mathrm{GMS}(s), \ll)$ up to $m$.

We call some $m \in \mathrm{GMS}(s)$ the $i$-th generalised milestone if it is the $i$-th element of
$\mathrm{GMS}(s)$ with respect to $\ll$. Later we will see that $\ll$ corresponds to the order in which the
generalised milestones appear in any run from the initial configuration to $s$. Note that the
restriction of $\ll$ to $\mathrm{MS}(s)$ coincides with the substack relation $\le$.
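To make the linear order on generalised milestones tangible, the following Python sketch enumerates $\mathrm{GMS}(s)$ in the order described by Lemma 4.6. It is a simplification under stated assumptions: collapse links are ignored, words are plain strings, the bottom symbol is written `_`, and all function names are hypothetical. For each width, the sketch first lists the prefixes of the previous word from longest to shortest down to the greatest common prefix, and then the longer and longer prefixes of the next word.

```python
def common_prefix(u, v):
    """Greatest common prefix of two words."""
    n = 0
    while n < min(len(u), len(v)) and u[n] == v[n]:
        n += 1
    return u[:n]


def generalised_milestones(stack, bottom="_"):
    """List GMS(s) in increasing order, links being ignored.

    stack: list of words (strings) that all start with `bottom`.
    Returns a list of stacks, each represented as a tuple of words.
    """
    result = []
    prev = bottom                       # plays the role of w_0
    for i, word in enumerate(stack):
        done = tuple(stack[:i])         # the words w_1 : ... : w_i
        gcp = common_prefix(prev, word)
        # Prefixes of the previous word, longest first, down to gcp.
        for n in range(len(prev), len(gcp) - 1, -1):
            result.append(done + (prev[:n],))
        # Prefixes of the next word that properly extend gcp.
        for n in range(len(gcp) + 1, len(word) + 1):
            result.append(done + (word[:n],))
        prev = word
    return result


def milestones(stack, bottom="_"):
    """MS(s): the generalised milestones whose topmost word is a
    prefix of the corresponding word of s."""
    return [m for m in generalised_milestones(stack, bottom)
            if stack[len(m) - 1].startswith(m[-1])]
```

For the stack `["_a", "_b"]` the enumeration yields five generalised milestones: `_`, `_a`, `_a : _a`, `_a : _` and `_a : _b`. Consecutive entries differ by exactly one push, clone or pop operation, and all entries except `_a : _a` are milestones.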

Now we introduce loops formally and characterise runs connecting generalised milestones
in terms of loops. Later we will see that there is a close correspondence between the
milestones of some stack $s$ and the nodes of our encoding $\mathrm{Enc}(s)$. This correspondence is
one of the key observations in proving that $\mathrm{Enc}$ yields a tree-automatic encoding of the
relation $R^{\Rightarrow}$.

Definition 4.8. Let $s$ be a stack and $q, q'$ states. A loop from $(q, s)$ to $(q', s)$ is a run
$\lambda \in \mathrm{Runs}((q, s), (q', s))$ that does not pass a substack of $\mathrm{pop}_2(s)$ and that may pass $\mathrm{pop}_1^k(s)$
only if the $k$ topmost elements of $\mathrm{top}_2(s)$ are letters with links of level 1. This means that
for all $i \in \mathrm{dom}(\lambda)$, if $\lambda(i) = \mathrm{pop}_1^k(s)$ then $\mathrm{CLvl}(\mathrm{pop}_1^{k'}(s)) = 1$ for all $0 \le k' < k$.

If $\lambda$ is a loop from $(q, s)$ to $(q', s)$ such that $\lambda(1) = \mathrm{pop}_1(s) = \lambda(\mathrm{length}(\lambda) - 1)$, then we
call $\lambda$ a low loop. If $\lambda$ is a loop from $(q, s)$ to $(q', s)$ that never passes $\mathrm{pop}_1(s)$, then we call
$\lambda$ a high loop.

For some stack $s$, we sometimes write that $\lambda$ is a loop of $s$. By this we express that $\lambda$ is a
loop and its initial (and final) stack is $s$. We now characterise runs connecting milestones
of some stack in terms of loops.

Lemma 4.9. Let $\rho$ be a run ending in stack $s = w_1 : w_2 : \dots : w_k$. Furthermore, let
$m \in \mathrm{GMS}(s) \setminus \{s\}$ be such that $\rho$ visits $m$. Let $i \in \mathrm{dom}(\rho)$ be maximal such that $\rho$ visits
$m$ at $i$, i.e., the stack at $\rho(i)$ is $m$. Then $\rho$ also visits the $\ll$-minimal generalised milestone
$m' \in \mathrm{GMS}(s) \setminus \mathrm{GMS}(m)$ and for $i' \in \mathrm{dom}(\rho)$ maximal such that $\rho(i') = m'$, $\rho{\restriction}_{[i+1, i']}$ is a
loop of $m'$.

Proof. We distinguish the following cases.

• Assume that $m' = \mathrm{clone}_2(m)$. In this case $m = w_1 : w_2 : \dots : w_{|m|}$. Thus, at the last
position $j \in \mathrm{dom}(\rho)$ where $|\rho(j)| = |m|$, the stack at $\rho(j)$ is $m$ (because $\rho$ never changes
the first $|m|$ many words after passing $\rho(j)$). Hence, $i = j$ by definition. Since $|s| > |m|$,
it follows directly that the operation at $i$ is a $\mathrm{clone}_2$ leading to $m'$. Note that $\rho$ never
passes a stack of width $|m|$ again. Thus, it follows from Lemma 2.11 that for $i'$ maximal
with $\rho(i') = m'$ the run $\rho{\restriction}_{[i+1, i']}$ never visits $\mathrm{pop}_1^k(m')$ if $\mathrm{CLvl}(\mathrm{pop}_1^{k-1}(m')) = 2$. Thus,
we conclude that $\rho{\restriction}_{[i+1, i']}$ is a loop.

• Assume that $m' = \mathrm{pop}_1(m)$. In this case, $m = w_1 : \dots : w_{|m|-1} : w$ for some $w$ such
that $w_{|m|-1} \sqcap w_{|m|} < w \le w_{|m|-1}$. Thus, $w \not\le w_{|m|}$ and creating $w_{|m|}$ as the $|m|$-th word
on the stack requires passing $w_1 : w_2 : \dots : w_{|m|-1} : w_{|m|-1} \sqcap w_{|m|}$. This is only possible
via applying $\mathrm{pop}_1$ or collapse of level 1 to $m$. Since we assumed $i$ to be maximal, the
operation at $i$ must be $\mathrm{pop}_1$ or collapse of level 1 and leads to $m'$.



We still have to show that $\rho{\restriction}_{[i+1, i']}$ is a loop. By definition of $i$ and $i'$, $\rho{\restriction}_{[i+1, i']}$ starts and ends
in $m'$. Due to the maximality of $i$, $\rho{\restriction}_{[i+1, i']}$ does not visit the stack $\mathrm{pop}_2(m) = \mathrm{pop}_2(m')$.
Furthermore, $\mathrm{top}_1(m')$ is a cloned element. Due to Lemma 2.11, if $\rho{\restriction}_{[i+1, i']}$ visits $\mathrm{pop}_1^k(m')$
then $\mathrm{CLvl}(\mathrm{pop}_1^{k-1}(m')) = 1$. Thus, $\rho{\restriction}_{[i+1, i']}$ is a loop.
• The last case is $m' = \mathrm{push}_{\sigma, l}(m)$ for $(\sigma, l) \in \Sigma \times \{1, 2\}$. In this case,

$$m = w_1 : w_2 : \dots : w_{|m|-1} : w$$

for some $w$ such that $w_{|m|-1} \sqcap w_{|m|} \le w < w_{|m|}$. Creating $w_{|m|}$ on the stack requires
pushing the missing symbols onto the stack as they cannot be obtained via clone operation
from the previous word. Since $i$ is maximal, the operation at $i$ is some $\mathrm{push}_{\sigma, l}$ leading
to $m'$. $\rho{\restriction}_{[i+1, i']}$ is a high loop due to the maximality of $i$ (this part of $\rho$ never visits
$m = \mathrm{pop}_1(m')$ or any other proper substack of $m'$). □

As a corollary of the lemma, we obtain that $\ll$ coincides with the order in which the generalised
milestones appear for the last time in a given run starting in the initial configuration.

Corollary 4.10. Let $s$ be some stack and $m_1 \in \mathrm{MS}(s)$. Some run $\rho \in \mathrm{Runs}(m_1, s)$ that
does not visit substacks of $m_1$ (after the initial configuration) decomposes as

$$\rho = \rho_1 \circ \lambda_1 \circ \dots \circ \rho_n \circ \lambda_n$$

where the $\lambda_i$ are loops and the $\rho_i$ are runs of length 1 that connect one generalised milestone
of $s$ with its $\ll$-successor (in $\mathrm{GMS}(s)$).

In particular, for $t = \mathrm{pop}_2^k(s)$ and configurations $c = (q, t)$ and $d = (q', s)$ a run
$\rho \in \mathrm{Runs}(c, d)$ witnesses $(c, d) \in R^{\Rightarrow}$ if and only if it decomposes as given above.

For the direction from right to left of the last claim note that, if $t = \mathrm{pop}_2^k(s)$ and
$\rho \in \mathrm{Runs}(t, s)$ decomposes as above, then $\rho_1$ performs a $\mathrm{clone}_2$. This implies that the run
$\lambda_1 \circ \rho_2 \circ \dots \circ \rho_n \circ \lambda_n$ cannot visit $t$ again.

This corollary shows that it is sufficient to understand loops of generalised milestones
of a given stack $s$ in order to understand runs in $R^{\Rightarrow}$ ending in $s$. In Section 4.6 we show
that the loops of a given stack $s$ can be computed by a finite automaton on input $\mathrm{top}_2(s)$.

4.3. Returns, 1-Loops and Decreasing Runs. Having analysed the form of runs in
$R^{\Rightarrow}$, we now analyse runs in the converse direction: how can we decompose a run in $R^{\Leftarrow}$?
We need the notion of returns and of level-1-loops in order to answer this question.

Definition 4.11. Let $t = s : w$ be some stack with topmost word $w$. A return from $t$ to
$s$ is a run $\rho \in \mathrm{Runs}(t, s)$ such that $\rho$ never visits a substack of $s$ before $\mathrm{length}(\rho)$ and such
that one of the following holds:

(1) the last operation in $\rho$ is $\mathrm{pop}_2$, or

(2) the last operation in $\rho$ is a collapse and $w < \mathrm{top}_2(\rho(\mathrm{length}(\rho) - 1))$, i.e., $\rho$ first pushes
some new letters onto $t$ and then performs a collapse of one of these new letters,
or

(3) there is some $i \in \mathrm{dom}(\rho)$ such that $\rho{\restriction}_{[i, \mathrm{length}(\rho)]}$ is a return from $\mathrm{pop}_1(t)$ to $s$.

Remark 4.12. The technical restrictions in the second condition have the following inten- 
tion. A return from t to pop 2 (t) is a run p from t to pop 2 (t) that does not use the level 2 
links stored in top 2 (t) (cf. pT| for a detailed discussion). 
It is useful to note that any return from some stack $s$ that visits $\mathrm{pop}_1(s)$ in fact satisfies
the last condition in the definition of return.

Lemma 4.13. Let $\rho$ be some return from a stack $s$ to $\mathrm{pop}_2(s)$. If $0 < i < \mathrm{length}(\rho)$ is the
first position such that $\rho$ visits $\mathrm{pop}_1(s)$ at $i$, then $\rho{\restriction}_{[i, \mathrm{length}(\rho)]}$ is a return from $\mathrm{pop}_1(s)$ to
$\mathrm{pop}_2(s)$, i.e., $i$ witnesses that $\rho$ satisfies the third condition in the definition of a return.

If the topmost word of the initial stack only contains cloned elements,
then there is an easy condition to verify that a run starting at this stack is a return.

Lemma 4.14. Let $s$ and $t$ be stacks. Assume that $|t| < |s|$ and that $\mathrm{top}_2(s) \le \mathrm{top}_2(t)$. For
$\rho$ some run of length $l$ starting at stack $s$, $\rho$ is a return if

(1) for all $0 \le i < l$, $|\rho(i)| \ge |s|$ and

(2) $|t| \le |\rho(l)| < |s|$.

Proof. The proof is by induction on the length of $\mathrm{top}_2(s)$. Note that the operation at $\rho(l - 1)$
is either $\mathrm{pop}_2$ or a collapse of level 2. If it is a $\mathrm{pop}_2$, then it is immediate that $\rho$ is a return.
If it is a collapse of level 2, then there is a prefix $w \le \mathrm{top}_2(s)$ and some word $v$ such that
$\mathrm{top}_2(\rho(l - 1)) = wv$. Since all level 2 links in $\mathrm{top}_2(s)$ point to stacks of width smaller than
$|t|$, $v$ must be nonempty by assumption (2). Thus, if $w = \mathrm{top}_2(s)$ we conclude immediately
that $\rho$ is a return. If $w < \mathrm{top}_2(s)$ then the only way to create a word $wv$ on a stack of width
at least $|s|$ with a link to a stack of smaller width requires visiting $\mathrm{pop}_2(s) : w$. But this
implies that $\rho$ visits $\mathrm{pop}_1(s)$ and we conclude by application of the induction hypothesis to
the final part of $\rho$ starting at the first occurrence of $\mathrm{pop}_1(s)$. □

Definition 4.15. Let $s$ be some stack and $w$ some word. A run $\lambda$ of length $n$ is called a
level-1-loop (or 1-loop) of $s : w$ if the following conditions are satisfied.

(1) $\lambda \in \mathrm{Runs}(s : w, s : s' : w)$ for some 2-word $s'$,

(2) for every $i \in \mathrm{dom}(\lambda)$, $|\lambda(i)| > |s|$, and

(3) for every $i \in \mathrm{dom}(\lambda)$ such that $\mathrm{top}_2(\lambda(i)) = \mathrm{pop}_1(w)$, there is some $j > i$ such that
$\lambda{\restriction}_{[i, j]}$ is a return.

Before we analyse the form of runs in $R^{\Leftarrow}$, we prove an auxiliary lemma.

Lemma 4.16. Let $\rho$ be a run of length $l$ starting in some stack $s$ with topmost word $w$ such
that

(1) $\rho$ does not visit a substack of $\mathrm{pop}_2(s)$ before its final configuration,

(2) $\mathrm{top}_2(\rho(l - 1))$ is a proper prefix of $w$, and

(3) the last stack operation in $\rho$ is a collapse of level 2.

If $\rho$ is not a return, then there is some $0 < i < l$ such that

(1) $\mathrm{top}_2(\rho(i)) = \mathrm{pop}_1(w)$,

(2) for all $i < j \le l$ the subrun $\rho{\restriction}_{[i, j]}$ is not a return, and

(3) for all $0 < j < i$ such that $\mathrm{top}_2(\rho(j)) = \mathrm{pop}_1(w)$ there is a $j < k \le i$ such that $\rho{\restriction}_{[j, k]}$ is
a return.

Proof. Assume that $\rho$ is such a run and that it is not a return. We first prove that there is
some $0 < i < l$ such that

• $\mathrm{top}_2(\rho(i)) = \mathrm{pop}_1(w)$ and

• for all $i < j \le l$ the subrun $\rho{\restriction}_{[i, j]}$ is not a return.

Afterwards we show that the minimal such $i$ also satisfies claim (3).
(1) Assume that the stack at $\rho(l - 1)$ decomposes as $w_1 : w_2 : \dots : w_k$ such that $w \not\le w_{|s|}$.
In this case let $i \le l - 1$ be minimal such that $|\rho(i)| = |s|$ and $w \not\le \mathrm{top}_2(\rho(i))$. Since $\rho$
does not visit a substack of $\mathrm{pop}_2(s)$ before $i$, we conclude immediately that the stack
at $i - 1$ is $s$ and the stack at $i$ is $\mathrm{pop}_1(s)$. We show that $i$ witnesses the claim. Note
that $\mathrm{top}_2(\rho(i)) = \mathrm{pop}_1(w)$. Heading for a contradiction, assume that there is some
$i < j \le l$ such that $\rho' := \rho{\restriction}_{[i, j]}$ is a return. This assumption implies directly that
$|\rho(j)| < |\rho(i)| = |s|$ whence $\rho'$ visits a substack of $\mathrm{pop}_2(s)$. Since $\rho$ does not do so
before $l$, we conclude that $l = j$. But this implies that $\rho$ is a run that starts in $s$, passes
$\mathrm{pop}_1(s)$ and continues with a return from $\mathrm{pop}_1(s)$. By definition, this implies that $\rho$ is
a return, contradicting our assumptions. Thus, we conclude that there is no $j > i$ such
that $\rho{\restriction}_{[i, j]}$ is a return.

(2) Otherwise, the stack at $\rho(l - 1)$ decomposes as $w_1 : w_2 : \dots : w_k$ for some $k > |s|$ and
there is some $n > |s|$ such that $w \le w_i$ for all $|s| \le i < n$ and $w \not\le w_n$. Let $i_0 < l - 1$
be maximal such that $|\rho(i_0 - 1)| < n$. Then the stack at $\rho(i_0 - 1)$ is $w_1 : w_2 : \dots : w_{n-1}$
and the operation at $i_0 - 1$ is $\mathrm{clone}_2$. Thus, $w \le \mathrm{top}_2(\rho(i_0))$. Let $i_0 < i_1 < l - 1$ be
minimal such that $|\rho(i_1)| = n$ and $\mathrm{top}_2(\rho(i_1)) < w$. By minimality of $i_1$, we conclude
that $|\rho(i_1 - 1)| = n$ whence $\mathrm{top}_2(\rho(i_1 - 1)) = w$ and $\mathrm{top}_2(\rho(i_1)) = \mathrm{pop}_1(w)$. In order to
prove that $i_1$ witnesses the claim of the lemma, we have to prove that for all $i_1 < j \le l$,
$\rho{\restriction}_{[i_1, j]}$ is not a return. But note that $|\rho(i_1)| = n \le |\rho(j)|$ for all $i_1 < j < l$ whence
$\rho{\restriction}_{[i_1, j]}$ is not a return for $i_1 < j < l$. Moreover, since $\rho$ ends in a collapse on a prefix of
$w = \mathrm{top}_2(\rho(0))$, we conclude that $|\rho(l)| < |\rho(0)| = |s| < n$. Thus, $|\rho(i_1)| - |\rho(l)| \ge 2$
and $\rho{\restriction}_{[i_1, l]}$ is not a return because it does not end in $\mathrm{pop}_2(\rho(i_1))$.

This completes the first part of our proof. We still have to deal with claim (3). For this
purpose let $0 < i < l$ be minimal such that

• $\mathrm{top}_2(\rho(i)) = \mathrm{pop}_1(w)$ and

• for all $i < j \le l$ the subrun $\rho{\restriction}_{[i, j]}$ is not a return.

Heading for a contradiction, assume that there is some $0 < j < i$ with $\mathrm{top}_2(\rho(j)) = \mathrm{pop}_1(w)$
such that there is no $j < k \le i$ such that $\rho{\restriction}_{[j, k]}$ is a return.

By minimality of $i$ there is some $k > i$ such that $\rho_j := \rho{\restriction}_{[j, k]}$ is a return. Since $\rho$ is no
return, we directly conclude that $\rho(j) \ne \mathrm{pop}_1(s)$ whence $|\rho(j)| > |s|$. We distinguish two
cases.

(1) If $|\rho(j)| > |\rho(i)|$ then the minimal $k_0 > j$ such that $|\rho(k_0)| < |\rho(j)|$ satisfies $k_0 \le i$. But
by definition, $k_0$ is the only candidate for $\rho{\restriction}_{[j, k_0]}$ being a return. Thus, $k = k_0$ which
contradicts the assumption that $k > i$.

(2) If $|\rho(j)| \le |\rho(i)|$, we conclude that $|s| \le |\rho(k)| < |\rho(i)|$. Thus, there is also a minimal
$k_0 > i$ such that $|s| \le |\rho(k_0)| < |\rho(i)|$. Since $\mathrm{top}_2(\rho(i)) \le \mathrm{top}_2(s) = w$, we conclude
with Lemma 4.14 that $\rho{\restriction}_{[i, k_0]}$ is a return, contradicting our choice of $i$.

□

With this lemma we are prepared to prove our decomposition result. 

Lemma 4.17. Let $s, s' \in \mathrm{Stck}(\Sigma)$ such that $s' = \mathrm{pop}_2^k(s)$ for some $k \in \mathbb{N}$ and let $\rho$ be some
run. $\rho$ is a run in $\mathrm{Runs}(s, s')$ that does not visit a substack of $s'$ before its final configuration
if and only if $\rho \in \mathrm{Runs}(s, s')$ and $\rho$ decomposes as $\rho = \rho_1 \circ \rho_2 \circ \dots \circ \rho_n$ where each $\rho_i$ is of
one of the following forms.
F1. $\rho_i$ is a return,
F2. $\rho_i$ is a 1-loop followed by a collapse of collapse level 2,

F3. $\rho_i$ is a 1-loop followed by a $\mathrm{pop}_1$ (or a collapse of collapse level 1), there is a $j > i$ such
that $\rho_j$ is of the form F2 and for all $i < k < j$, $\rho_k$ is of the form F3.
Proof. First of all, note that the case $\mathrm{length}(\rho) = 0$ is solved by setting $n := 0$.

We proceed by induction on the length of $\rho$. Assume that $\rho \in \mathrm{Runs}(s, s')$ and that it
does not visit $s'$ before the final configuration. We write $(q_i, s_i)$ for the configuration $\rho(i)$.
Firstly, consider the case that there is some $m \in \mathrm{dom}(\rho)$ such that $\rho_1 := \rho{\restriction}_{[0, m]}$ is a return.
Then $\rho_1$ is of the form F1. By induction hypothesis, $\rho{\restriction}_{[m, \mathrm{length}(\rho)]}$ decomposes as desired.

Otherwise, assume that there is no $m \in \mathrm{dom}(\rho)$ such that $\rho{\restriction}_{[0, m]}$ is a return.

Nevertheless, there is a minimal $m \in \mathrm{dom}(\rho)$ such that $|s_m| < |s|$. The last operation
of $\hat\rho := \rho{\restriction}_{[0, m]}$ is a collapse such that $\mathrm{top}_2(s_{m-1}) \le \mathrm{top}_2(s)$ (otherwise $\hat\rho$ would be a return).

Writing $w := \mathrm{top}_2(s_{m-1})$, we distinguish two cases.

(1) First consider the case that $w = \mathrm{top}_2(s)$. Note that this implies $\mathrm{CLvl}(s) = 2$ because
the last operation of $\hat\rho$ is a collapse of level 2.

Furthermore, we claim that $\hat\rho$ does not visit $\mathrm{pop}_1(s)$. Heading for a contradiction,
assume that $\hat\rho(i) = \mathrm{pop}_1(s)$ for some $i \in \mathrm{dom}(\hat\rho)$. Since $\hat\rho$ does not visit $\mathrm{pop}_2(s)$ between
$i$ and $m - 1$, $\mathrm{top}_2(\hat\rho(m - 1)) = w$ is only possible if $\mathrm{CLnk}(w) = |s| - 1$ (cf. Lemma 2.11).
But then $\hat\rho{\restriction}_{[i, m]}$ is a return of $\mathrm{pop}_1(s)$ whence by definition $\hat\rho$ is a return of $s$. This
contradicts our assumption.

We claim that $\hat\rho$ is a 1-loop plus a collapse operation: we have already seen that $\hat\rho$
does not visit any proper substack of $s$. Thus, it suffices to show that $\hat\rho$ reaches a stack
with topmost word $\mathrm{pop}_1(w)$ only at positions where a return starts.

Let $i$ be a position such that $\mathrm{top}_2(\hat\rho(i)) = \mathrm{pop}_1(w)$. Recall that $\mathrm{top}_2(s_{m-1}) = w$,
$\mathrm{CLvl}(w) = 2$ and $\mathrm{CLnk}(w) \le |s| - 1$. Since $\hat\rho$ does not visit $\mathrm{pop}_1(s)$, $|\hat\rho(i)| > |s|$ and
we cannot restore $\mathrm{top}_1(w)$ by a push operation. Thus, there is some minimal position
$m > j > i$ such that $|\hat\rho(j)| < |\hat\rho(i)|$. Lemma 4.14 implies that $\hat\rho{\restriction}_{[i, j]}$ is a return.

Thus, $\rho_1 := \hat\rho$ is of the form F2.

(2) For the other case, assume that $w < \mathrm{top}_2(s)$. Since $\hat\rho$ is not a return, we may apply the
previous lemma. We conclude that there is some $i \in \mathrm{dom}(\hat\rho)$ such that $\rho_1 := \hat\rho{\restriction}_{[0, i]}$ is a
1-loop followed by a $\mathrm{pop}_1$ or a collapse of level 1 and such that there is no $j > i$ such
that $\hat\rho{\restriction}_{[i, j]}$ is a return. In order to show that $\rho_1$ is of the form F3 we have to check the
side conditions on the segments following in the decomposition of $\rho$. For this purpose
set $\rho' := \rho{\restriction}_{[i, \mathrm{length}(\rho)]}$. By induction hypothesis $\rho'$ decomposes as $\rho' = \rho_2 \circ \rho_3 \circ \dots \circ \rho_n$
where the $\rho_i$ satisfy the claim of the lemma.

Now, by definition of $i$, $\rho'$ does not start with a return. Thus, $\rho_2$ is of one of the
forms F2 or F3. But these forms require that there is some $j \ge 2$ such that $\rho_j$ is of form
F2 and for all $2 \le k < j$, $\rho_k$ is of the form F3. From this condition it follows directly
that $\rho = \rho_1 \circ \rho' = \rho_1 \circ \rho_2 \circ \rho_3 \circ \dots \circ \rho_n$ and $\rho_1$ is of the form F3. □

For the other direction, assume that $\rho \in \mathrm{Runs}(s, s')$ decomposes as $\rho = \rho_1 \circ \rho_2 \circ \dots \circ \rho_n$
where each $\rho_i$ is of one of the forms F1-F3. Let $i_1 < i_2 < \dots < i_k = n$ be the subsequence
of subruns of the forms F1 or F2. Let $s_{i_j}$ be the stack of the final configuration of $\rho_{i_j}$. A
straightforward induction shows that $|s_{i_1}| > |s_{i_2}| > \dots > |s_{i_k}|$ and that all stacks that occur
after the final configuration of $\rho_{i_j}$ and before the final configuration of $\rho_{i_{j+1}}$ have width at
least $|s_{i_j}|$. Analogously, all stacks occurring before the final configuration of $\rho_{i_1}$ have width
at least $|s|$. Thus, we conclude immediately that substacks of $s'$ cannot be visited before
the final configuration of $\rho$.

4.4. Decompositions for Runs in $R^{\downarrow}$ or $R^{\uparrow}$. The decomposition of runs witnessing
that $(c_1,c_2) \in R^{\Leftarrow}$ or $(c_1,c_2) \in R^{\Rightarrow}$, respectively, turns out to be very useful for proving
tree-automaticity of $R^{\Leftarrow}$ and $R^{\Rightarrow}$. We use similar characterisations for runs witnessing
that certain pairs of configurations are contained in $R^{\downarrow}$ or $R^{\uparrow}$. The proofs of the following
characterisations are straightforward inductions.

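Lemma 4.18. Let $c_1, c_2 \in \mathrm{Cnf}$ and $\rho$ some run.

(1) $\rho$ witnesses $(c_1,c_2) \in R^{\downarrow}$ if and only if $\rho \in \mathrm{Runs}(c_1,c_2)$ and $\rho$ decomposes as

$\rho = \lambda_1 \circ \rho_1 \circ \lambda_2 \circ \rho_2 \circ \dots \circ \lambda_n \circ \rho_n$

where each $\lambda_i$ is a high loop and each $\rho_i$ is a run performing exactly one transition which
is a $\mathrm{pop}_1$ or a collapse of level 1.

(2) Analogously, $\rho$ witnesses $(c_1,c_2) \in R^{\uparrow}$ if and only if $\rho \in \mathrm{Runs}(c_1,c_2)$ and $\rho$ decomposes
as $\rho = \lambda_0 \circ \rho_1 \circ \lambda_1 \circ \rho_2 \circ \dots \circ \lambda_{n-1} \circ \rho_n \circ \lambda_n$ where the $\lambda_i$ are high loops and each $\rho_i$
performs exactly one push operation.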

4.5. Computing Returns. In this section, we prove that the existence of returns starting
at a given stack $s$ depends inductively on the returns starting at $\mathrm{pop}_1(s)$. Later we use this
result in order to show that there is a similar dependence of loops starting in $s$ on loops
and returns starting in $\mathrm{pop}_1(s)$. Let us fix a CPS $S$. For some word $w$ occurring in a level
2 stack, let $w{\downarrow}_0$ denote the word where each level 2 link is replaced by 0, i.e., each $(\sigma, 2, i)$
is replaced by $(\sigma, 2, 0)$.
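
The link-resetting operation can be illustrated by a short sketch. It assumes, purely for illustration, that a word is a Python list whose letters are either bare symbols or triples `(symbol, level, link)`; the function name `reset_level2_links` is hypothetical and not taken from the paper.

```python
def reset_level2_links(word):
    """Return a copy of `word` in which every level 2 link is set to 0."""
    result = []
    for letter in word:
        # Assumed representation: linked letters are triples (symbol, level, link),
        # all other letters are bare symbols and are copied unchanged.
        if isinstance(letter, tuple) and len(letter) == 3 and letter[1] == 2:
            result.append((letter[0], 2, 0))
        else:
            result.append(letter)
    return result
```

Letters carrying a level 1 link are left untouched, matching the text, which only resets level 2 links.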

Definition 4.19. Set $\mathrm{Rt}(w) := \{(q,q') : \text{there is a return from } (q, w{\downarrow}_0 : w{\downarrow}_0) \text{ to } (q', w{\downarrow}_0)\}$.
We also set $\mathrm{Rt}(s) := \mathrm{Rt}(\mathrm{top}_2(s))$.

The main goal of this section is the proof of the following proposition. 

Proposition 4.20. There is a finite automaton with $2^{|Q \times Q|}$ many states that computes
$\mathrm{Rt}(w)$ on input $w{\downarrow}_0$.

Remark 4.21. In fact, the automaton can be effectively constructed from a given CPS
(cf. The same holds analogously for Proposition 4.29 and Corollary 4.38.

This proposition relies on the observation that returns of a stack $s$ with topmost word
$w$ are composed of runs that are prefixed by $s$ and of runs that are returns of stacks with
topmost word $\mathrm{pop}_1(w)$. Furthermore, it relies on the observation that stacks with equal
topmost word share the same returns. The reader who is not interested in the proof details
may safely skip these and continue reading Section 4.6.

Lemma 4.22. Let $s$ be a stack with $|s| \ge 2$. There is a return from $(q,s)$ to $(q',\mathrm{pop}_2(s))$
if and only if $(q, q') \in \mathrm{Rt}(\mathrm{top}_2(s))$.



Proof. Let $w := \mathrm{top}_2(s){\downarrow}_0$. A tedious but straightforward induction on the length of the
return provides a transition-by-transition copy of a return starting at $(q, s)$ to a return
starting at $(q, w : w)$ and vice versa. □

The following auxiliary lemmas prepare the decomposition of returns into subparts that
are returns starting at stacks with smaller topmost words and subparts that are prefixed
by the first stack of the return.

Lemma 4.23. Let $\rho$ be a return starting at some stack $s$ with $\mathrm{top}_2(s) = w$. If $\rho$ visits
$\mathrm{pop}_1(s)$, let $k$ be the first occurrence of $\mathrm{pop}_1(s)$ in $\rho$, otherwise let $k := \mathrm{length}(\rho)$. If
$0 < i < k$ is a position such that $s \unlhd \rho(i-1)$ and $\mathrm{top}_2(\rho(i)) = \mathrm{pop}_1(w)$ then there is some
$i < j \le k$ such that $\rho{\restriction}_{[i,j]}$ is a return.

Proof. Since $i < k$, the stack at $\rho(i)$ is not $\mathrm{pop}_1(s)$. Thus, the stack at $\rho(i)$ is of the form
$s' : \mathrm{pop}_1(w)$ with $s \unlhd s'$, in particular $|s| < |\rho(i)|$. There is a minimal $i < j \le k$ such that
$|\rho(j)| < |\rho(i)|$.

If $j < \mathrm{length}(\rho)$, we conclude by application of Lemma 4.14. Otherwise, $j = \mathrm{length}(\rho)$
and $\rho$ does not visit $\mathrm{pop}_1(s)$. Since $\rho$ is a return, the operation at $j-1$ is $\mathrm{pop}_2$ (whence
$\rho{\restriction}_{[i,j]}$ is a return) or there is a nonempty word $v$ such that $\mathrm{top}_2(\rho(j-1)) = wv$ and the
operation at $j-1$ is a collapse of level 2. Since $v$ was created between $i$ and $j-1$ its topmost
link points to a stack of width at least $|\rho(i)| - 1$ and we conclude again with Lemma 4.14
that $\rho{\restriction}_{[i,j]}$ is a return. □

Lemma 4.24. Let $\rho$ be some run, $s$ some stack with topmost word $w := \mathrm{top}_2(s)$ such that
the following holds.

(1) $s \unlhd \rho(0)$,

(2) $\rho(i) \not< s$ for all $0 \le i \le \mathrm{length}(\rho)$, and

(3) for all $0 < i < \mathrm{length}(\rho)$ such that $s \unlhd \rho(i-1)$ and $w \not\le \mathrm{top}_2(\rho(i))$, there is some
$i < j \le \mathrm{length}(\rho)$ such that $\rho{\restriction}_{[i,j]}$ is a return.

There is a well-defined sequence

$0 =: j_0 \le i_1 < j_1 \le i_2 < j_2 \le \dots \le i_n < j_n \le i_{n+1} := \mathrm{length}(\rho)$

with the following properties.

(1) For $1 \le k \le n+1$, $s \unlhd \rho{\restriction}_{[j_{k-1}, i_k]}$.

(2) For each $1 \le k \le n$, there is a stack $s_k$ with $\mathrm{top}_2(s_k) = \mathrm{pop}_1(w)$ such that $\rho{\restriction}_{[i_k+1, j_k]}$ is
a return from $s_k$ to $\mathrm{pop}_2(s_k)$.

(3) For all $1 \le k \le n$, $\mathrm{top}_2(\rho(i_k)) = w$ and the operation at $i_k$ in $\rho$ is a $\mathrm{pop}_1$ or a collapse
of level 1.

Proof. The proof is by induction on $\mathrm{length}(\rho)$. If $s \unlhd \rho$ (in particular, if $\mathrm{length}(\rho) = 0$), set
$n := 0$ and we are done. Otherwise, let $j_0 := 0$ and let $i_1 \in \mathrm{dom}(\rho)$ be the minimal position
such that $s \unlhd \rho(i_1)$ but $s \not\unlhd \rho(i_1 + 1)$. Since $\rho(i_1 + 1) \not< s$, the stack at $\rho(i_1 + 1)$ must be
of the form $s' : \mathrm{pop}_1(w)$ for some $s'$ such that $s \unlhd s'$. This requires that the stack at $\rho(i_1)$
is $s' : w$ and the operation at $i_1$ is a $\mathrm{pop}_1$ or a collapse of level 1. By assumption on $\rho$, there
is some $i_1 + 1 < j_1 \le \mathrm{length}(\rho)$ such that $\rho{\restriction}_{[i_1+1, j_1]}$ is a return. Thus, the stack at $\rho(j_1)$
is $s'$ whence $s \unlhd \rho(j_1)$. Thus, we can apply the induction hypothesis to $\rho{\restriction}_{[j_1, \mathrm{length}(\rho)]}$ which
settles the claim. □

The previous lemma allows us to classify returns as follows.

Corollary 4.25. Let $\rho$ be a run starting in some stack $s$ with topmost word $w = \mathrm{top}_2(s)$. $\rho$
is a return from $s$ to $\mathrm{pop}_2(s)$ that does not pass $\mathrm{pop}_1(s)$ if and only if $\rho \in \mathrm{Runs}(s, \mathrm{pop}_2(s))$
and there is a uniquely defined sequence

$0 =: j_0 \le i_1 < j_1 \le i_2 < j_2 \le \dots \le i_n < j_n \le i_{n+1} := \mathrm{length}(\rho) - 1$

with the following properties.

(1) For $1 \le k \le n+1$, $s \unlhd \rho{\restriction}_{[j_{k-1}, i_k]}$.

(2) For each $1 \le k \le n$, there is a stack $s_k$ with $\mathrm{top}_2(s_k) = \mathrm{pop}_1(w)$ such that $\rho{\restriction}_{[i_k+1, j_k]}$ is
a return from $s_k$ to $\mathrm{pop}_2(s_k)$.

(3) For all $1 \le k \le n$, $\mathrm{top}_2(\rho(i_k)) = w$ and the operation at $i_k$ in $\rho$ is a $\mathrm{pop}_1$ or a collapse
of level 1.

(4) Either $w$ is a proper prefix of $\mathrm{top}_2(\rho(i_{n+1}))$ and the operation at $i_{n+1}$ is a collapse of
level 2 or $w$ is a prefix of $\mathrm{top}_2(\rho(i_{n+1}))$ and the operation at $i_{n+1}$ is a $\mathrm{pop}_2$.

Proof. First assume that $\rho$ is such a return. Due to Lemma 4.23 we can apply Lemma 4.24
to $\rho{\restriction}_{[0,\mathrm{length}(\rho)-1]}$. This gives immediately the first three items. The last item is a direct
consequence of the definition of a return.

Now assume that $\rho$ is a run from $s$ to $\mathrm{pop}_2(s)$ that satisfies conditions (1)-(4). Heading
for a contradiction assume that $\rho$ visits $\mathrm{pop}_1(s)$. Due to (1) this happens at some position
$i_k + 1 \le j \le j_k - 1$. Due to (2) we conclude that the width of the stack at $\rho(j_k)$ is smaller
than the width at $j$. But this contradicts condition (1) because $s \unlhd \rho(j_k)$.

For similar reasons $\rho$ does not visit a substack of $\mathrm{pop}_2(s)$ before the final configuration.
Thus, condition (4) implies that $\rho$ is a return. □

Corollary 4.26. Let $\rho$ be a run starting in some stack $s$ with topmost word $w = \mathrm{top}_2(s)$.
$\rho$ is a return from $s$ to $\mathrm{pop}_2(s)$ that passes $\mathrm{pop}_1(s)$ if and only if $\rho \in \mathrm{Runs}(s, \mathrm{pop}_2(s))$ and
there is a uniquely defined sequence

$0 =: j_0 \le i_1 < j_1 \le i_2 < j_2 \le \dots \le i_n < j_n = \mathrm{length}(\rho)$

with the following properties.

(1) For $1 \le k \le n$, $s \unlhd \rho{\restriction}_{[j_{k-1}, i_k]}$.

(2) For each $1 \le k \le n$, there is a stack $s_k$ with $\mathrm{top}_2(s_k) = \mathrm{pop}_1(w)$ such that $\rho{\restriction}_{[i_k+1, j_k]}$ is
a return from $s_k$ to $\mathrm{pop}_2(s_k)$.

(3) For all $1 \le k \le n$, $\mathrm{top}_2(\rho(i_k)) = w$ and the operation at $i_k$ in $\rho$ is a $\mathrm{pop}_1$ or a collapse
of level 1.

Proof. First assume that $\rho$ is such a return. Let $0 < k < \mathrm{length}(\rho)$ be the first occurrence of
$\mathrm{pop}_1(s)$ in $\rho$. Application of Lemma 4.24 to $\rho{\restriction}_{[0,k-1]}$ yields a decomposition into $s$-prefixed
parts and returns (ending with an $s$-prefixed part). Finally, due to Lemma 4.13, $\rho{\restriction}_{[k,\mathrm{length}(\rho)]}$
is also a return.

Now assume that $\rho$ is a run from $s$ to $\mathrm{pop}_2(s)$ that satisfies conditions (1)-(3). As in
the previous corollary, we conclude that $\rho$ does not visit substacks of $\mathrm{pop}_2(s)$ before the
final configuration. Due to condition (2), $\rho(i_n + 1) = \mathrm{pop}_1(s)$ and $\rho{\restriction}_{[i_n+1,\mathrm{length}(\rho)]}$ is a return
whence $\rho$ is also a return. □

In the following corollary, we assume that $\mathrm{Rt}(\varepsilon) = \emptyset$.

Corollary 4.27. For each stack $s$, $\mathrm{Rt}(s)$ is determined by $\mathrm{Rt}(\mathrm{pop}_1(s))$, $\mathrm{Sym}(s)$ and $\mathrm{CLvl}(s)$.

Proof. Let $w$ and $w'$ be words such that $\mathrm{Sym}(w) = \mathrm{Sym}(w')$, $\mathrm{CLvl}(w) = \mathrm{CLvl}(w')$ and
$\mathrm{Rt}(\mathrm{pop}_1(w)) = \mathrm{Rt}(\mathrm{pop}_1(w'))$. Fix a return $\rho$ starting in $(q_1,s)$ for $s := w{\downarrow}_0 : w{\downarrow}_0$ and
ending in $(q_2, w{\downarrow}_0)$. We have to prove that there is a return $\rho'$ from $(q_1, s')$ to $(q_2, w'{\downarrow}_0)$ for
$s' := w'{\downarrow}_0 : w'{\downarrow}_0$.

The proof is by induction on $|w|$. Assume that $\rho$ does not visit $\mathrm{pop}_1(s)$ and let

$0 =: j_0 \le i_1 < j_1 \le i_2 < j_2 \le \dots \le i_n < j_n \le i_{n+1} := \mathrm{length}(\rho) - 1$

be the sequence according to Corollary 4.25. If $\rho$ visits $\mathrm{pop}_1(s)$, use the sequence according
to Corollary 4.26 and proceed analogously. For all $k \le n+1$, $s \unlhd \rho{\restriction}_{[j_{k-1}, i_k]}$. Set
$\rho_k := \rho{\restriction}_{[j_{k-1}, i_k]}[s/s']$ (cf. Lemma 2.15). This settles the claim if $n = 0$.

For the case $n > 0$, note that $\rho{\restriction}_{[i_k+1, j_k]}$ is a return starting at some stack with topmost
word $\mathrm{pop}_1(w)$. Thus, $\mathrm{Rt}(\mathrm{pop}_1(w)) = \mathrm{Rt}(\mathrm{pop}_1(w')) \ne \emptyset$ whence $w$ and $w'$ are words of length
at least 2. Let $\delta_k$ be the transition connecting $\rho(i_k)$ with $\rho(i_k + 1)$. Note that $\delta_k$ is either a
$\mathrm{pop}_1$ transition or a collapse transition and $\mathrm{CLvl}(\rho(i_k)) = 1$. Note that $\mathrm{top}_2(\rho(i_k)) = w{\downarrow}_0$
whence $\mathrm{top}_2(\rho_k(\mathrm{length}(\rho_k))) = w'{\downarrow}_0$. Hence, $\delta_k$ is applicable to the last configuration
of $\rho_k$ and leads to a configuration $c_k$ with topmost word $\mathrm{pop}_1(w'{\downarrow}_0)$.

Recall that $\rho{\restriction}_{[i_k+1, j_k]}$ is a return starting at some stack with topmost word $\mathrm{pop}_1(w{\downarrow}_0)$.
Since $\mathrm{Rt}(\mathrm{pop}_1(w)) = \mathrm{Rt}(\mathrm{pop}_1(w'))$, Lemma 4.22 provides a return $\rho'_k$ from $c_k$ to $\rho_{k+1}(0)$.

Finally, let $\gamma$ be the transition connecting $\rho(i_{n+1})$ with $\rho(i_{n+1} + 1) = \rho(\mathrm{length}(\rho))$.
$\gamma$ connects the last configuration of $\rho_{n+1}$ with $(q_2, w'{\downarrow}_0) = (q_2, \mathrm{pop}_2(s'))$: either $\gamma$ is a
$\mathrm{pop}_2$ transition and $|\rho(i_{n+1})| = 2 = |\rho_{n+1}(\mathrm{length}(\rho_{n+1}))|$ or $\gamma$ is a collapse transition and
$\mathrm{top}_1(\rho(i_{n+1})) = (\sigma,2,1) = \mathrm{top}_1(\rho(i_{n+1})[s/s']) = \mathrm{top}_1(\rho_{n+1}(\mathrm{length}(\rho_{n+1})))$. Thus,

$\rho' := \rho_1 \circ \delta_1 \circ \rho'_1 \circ \rho_2 \circ \dots \circ \rho_n \circ \delta_n \circ \rho'_n \circ \rho_{n+1} \circ \gamma$

is a return from $(q_1,s')$ to $(q_2, \mathrm{pop}_2(s'))$. □

Proposition 4.20, which states that $\mathrm{Rt}(w)$ can be computed by a finite automaton, is a
direct consequence of Corollary 4.27: $\mathrm{Rt}(w)$ is a subset of $Q \times Q$. An automaton in state
$\mathrm{Rt}(\mathrm{pop}_1(w))$ can change to state $\mathrm{Rt}(w)$ on input $\mathrm{Sym}(w)$ and $\mathrm{CLvl}(w)$.
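
The automaton described above can be pictured as a left-to-right fold over the letters of the word. The following sketch assumes a hypothetical transition function `step(rt_prev, sym, clvl)` that maps the set of returns of the shorter word, the topmost symbol, and the collapse level to the set of returns of the longer word. The text guarantees that such a function exists, but its concrete definition depends on the CPS and is not given here. States are frozensets of state pairs, so there are at most 2^(|Q|*|Q|) of them.

```python
def compute_rt(letters, step):
    """Run the return automaton on a word given as (symbol, collapse_level) pairs.

    `step` is a hypothetical stand-in for the CPS-specific update; it is an
    assumption of this sketch and not part of the paper.
    """
    state = frozenset()  # the set of returns of the empty word is taken to be empty
    for sym, clvl in letters:
        # One automaton transition per letter: the new state depends only on
        # the old state, the symbol, and the collapse level.
        state = frozenset(step(state, sym, clvl))
    return state
```

Any concrete `step` only needs to be a function on subsets of Q x Q, which is why finitely many states suffice.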

4.6. Computing (1-) Loops. In analogy to the results of the previous section, we now 
investigate the existence of loops. We follow exactly the same ideas except for the fact that 
a loop of some stack s depends on the loops and returns of pop 1 (s). At the end of this 
section, we provide a similar result for 1-loops. 

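Definition 4.28. Set $\mathrm{Lp}(w) := \{(q,q') : \text{there is a loop from } (q, w{\downarrow}_0) \text{ to } (q', w{\downarrow}_0)\}$. Similarly,
let $\mathrm{hLp}(w)$ and $\ell\mathrm{Lp}(w)$ be the analogous sets for high loops and low loops, respectively.
Set $1\mathrm{Lp}(w) := \{(q, q') : \text{there is a stack } s \text{ and a 1-loop from } (q, w{\downarrow}_0) \text{ to } (q', s : w{\downarrow}_0)\}$. We
also set $\mathrm{Lp}(s) := \mathrm{Lp}(\mathrm{top}_2(s))$ and analogously for $\mathrm{hLp}$, $\ell\mathrm{Lp}$ and $1\mathrm{Lp}$.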

Extending the result of the previous section, our main goal is the following automaticity 
result for Lp and lLp. 

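Proposition 4.29. There is a finite automaton $A$ with $2^{|Q \times Q|} \cdot 2^{|Q \times Q|} \cdot |\Sigma|^2 \cdot 2$ many states
that computes $\mathrm{Rt}(w)$, $\mathrm{Lp}(w)$, $\mathrm{hLp}(w)$, $\ell\mathrm{Lp}(w)$, $1\mathrm{Lp}(w)$, $\mathrm{Sym}(w)$ and $\mathrm{CLvl}(w)$ on input $w{\downarrow}_0$
(where $w$ is a word occurring as topmost word of some stack $s$, i.e., for $w = \mathrm{top}_2(s)$).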

The reader who is not interested in the proof details can safely skip the rest of this
section and continue with Section 5. In the following, we mainly use the same arguments as
in the return case, but we have to consider loops of the stack $\mathrm{pop}_1(s)$ because those occur
as subruns of low loops of $s$. We omit proofs whenever they are analogous to the return
case.

Lemma 4.30. Let $s$ be some stack. There is a loop from $(q,s)$ to $(q',s)$ if and only if
$(q,q') \in \mathrm{Lp}(\mathrm{top}_2(s))$. The analogous statement holds for $\mathrm{hLp}$, $\ell\mathrm{Lp}$, and $1\mathrm{Lp}$.

The next step towards the proof of our main proposition is a characterisation of $\mathrm{Lp}(w)$
in terms of $\mathrm{Lp}(\mathrm{pop}_1(w))$ and $\mathrm{Rt}(\mathrm{pop}_1(w))$ analogously to the result of Corollary 4.25 for
returns. We do this in the following three lemmas. First, we present a unique decomposition
of loops into high and low loops. Afterwards, we characterise low loops and high loops.

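Lemma 4.31. Let $\lambda$ be a loop from $(q, s)$ to $(q', s)$. $\lambda$ is either a high loop or it has a unique
decomposition as $\lambda = \lambda_0 \circ \lambda_1 \circ \lambda_2$ where $\lambda_0$ and $\lambda_2$ are high loops and $\lambda_1$ is a low loop.

Proof. If $\lambda$ is not a high loop, let $i \in \mathrm{dom}(\lambda)$ be the minimal position just before the first
occurrence of $\mathrm{pop}_1(s)$ and $j \in \mathrm{dom}(\lambda)$ be the position directly after the last occurrence of
$\mathrm{pop}_1(s)$. By definition, $\lambda{\restriction}_{[0,i]}$ and $\lambda{\restriction}_{[j,\mathrm{length}(\lambda)]}$ are high loops and $\lambda{\restriction}_{[i,j]}$ is a low loop. □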

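Corollary 4.32. The set $\mathrm{Lp}(s)$ can be computed from the sets $\mathrm{hLp}(s)$ and $\ell\mathrm{Lp}(s)$ via
$\mathrm{Lp}(s) := \mathrm{hLp}(s) \cup \{(q, q') : \exists q_1, q_2\ (q, q_1) \in \mathrm{hLp}(s),\ (q_1, q_2) \in \ell\mathrm{Lp}(s), \text{ and } (q_2, q') \in \mathrm{hLp}(s)\}$.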
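
The formula of the corollary is a union with a threefold relational composition. A minimal sketch, assuming the sets of high loops and low loops are given as Python sets of state pairs (the function names are illustrative only):

```python
def compose(r1, r2):
    """Relational composition: all (a, c) with (a, b) in r1 and (b, c) in r2."""
    return {(a, c) for (a, b) in r1 for (b2, c) in r2 if b == b2}


def loops_from_high_and_low(h_lp, l_lp):
    """All loops: a high loop, or a high loop, then a low loop, then a high loop."""
    return set(h_lp) | compose(compose(h_lp, l_lp), h_lp)
```

For example, with high loops `{("p", "p"), ("q", "q")}` and the single low loop `("p", "q")`, the result contains the two high loops together with `("p", "q")`.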

In the following, we first explain how low loops depend on the loops of smaller stacks, 
afterwards we explain how high loops depend on returns of smaller stacks. 

Lemma 4.33. Let $\lambda$ be a low loop starting and ending in stack $s$. Then $\lambda{\restriction}_{[1,\mathrm{length}(\lambda)-1]}$ is
a loop starting and ending in $\mathrm{pop}_1(s)$. The operation at 0 is a $\mathrm{pop}_1$ or a collapse of level
1. The operation at $\mathrm{length}(\lambda) - 1$ is a $\mathrm{push}_\sigma$ where $\mathrm{top}_1(s) = \sigma \in \Sigma$.

Proof. Note that each low loop $\lambda$ satisfies $\lambda(1) = \mathrm{pop}_1(s) = \lambda(\mathrm{length}(\lambda) - 1)$. Since
$\mathrm{pop}_2(\mathrm{pop}_1(s)) = \mathrm{pop}_2(s)$, it follows directly that the run in between satisfies the
definition of a loop. □

Corollary 4.34. $\ell\mathrm{Lp}(s)$ depends on $\mathrm{Sym}(s)$, $\mathrm{CLvl}(s)$, $\mathrm{Sym}(\mathrm{pop}_1(s))$ and $\mathrm{Lp}(\mathrm{pop}_1(s))$.

Note that Sym(s) and CLvl(s) determine whether the first transition of a low loop can 
be applied to the stack and that Sym(pop 1 (s)) determines whether the last transition of a 
low loop can be applied. 

In analogy to the return case, we provide a decomposition of high loops which shows 
that hLp(s) is determined by the returns of pop 1 (s) and by the topmost symbol and link 
level of s. 

Lemma 4.35 (cf. Corollary 4.25). Let $\lambda$ be some run starting in some stack $s$ with topmost
word $w = \mathrm{top}_2(s)$. $\lambda$ is a high loop from $s$ to $s$ if and only if $\lambda \in \mathrm{Runs}(s, s)$ and there is a
sequence $0 =: j_0 \le i_1 < j_1 \le i_2 < j_2 \le \dots \le i_n < j_n \le i_{n+1} := \mathrm{length}(\lambda)$ such that

(1) for $1 \le k \le n+1$, $s \unlhd \lambda{\restriction}_{[j_{k-1}, i_k]}$ and

(2) for each $1 \le k \le n$, there is a stack $s_k$ with $\mathrm{top}_2(s_k) = \mathrm{pop}_1(w)$ such that $\lambda{\restriction}_{[i_k+1, j_k]}$ is
a return of $s_k$.

Corollary 4.36 (cf. Corollary 4.27). $\mathrm{Rt}(\mathrm{pop}_1(s))$, $\mathrm{Sym}(s)$, and $\mathrm{CLvl}(s)$ determine $\mathrm{hLp}(s)$.

We conclude the proof of Proposition 4.29 by showing that $1\mathrm{Lp}(s)$ is determined by
$\mathrm{Rt}(s)$, $\mathrm{Sym}(s)$ and $\mathrm{CLvl}(s)$.

Lemma 4.37. Let $\rho$ be some run. $\rho$ is a 1-loop from some stack $s$ to some stack $s'$
with $\mathrm{top}_2(s) = \mathrm{top}_2(s')$ if and only if $\rho$ is a run from some stack $s$ to some stack $s'$ with
$\mathrm{top}_2(s) = \mathrm{top}_2(s')$ and $\rho$ decomposes as

$\rho = \lambda_0 \circ \rho_1 \circ \lambda_1 \circ \rho_2 \circ \dots \circ \rho_n \circ \lambda_n$

where $s \unlhd \lambda_i$ and each $\rho_i$ is a return of a stack $s_i$ with $\mathrm{top}_2(s_i) = \mathrm{top}_2(s)$.

Proof. First assume that $\rho$ is a 1-loop. If $s \unlhd \rho$, set $n = 0$. Otherwise, let $j$ be minimal
such that $s \not\unlhd \rho(j + 1)$. Set $\lambda_0 := \rho{\restriction}_{[0,j]}$. Note that $\lambda_0$ is $s$-prefixed. Since $\rho$ does not visit
substacks of $\mathrm{pop}_2(s)$, $\mathrm{top}_2(\rho(j+1)) = \mathrm{pop}_1(\mathrm{top}_2(s))$. This implies that $\mathrm{top}_2(\rho(j)) = \mathrm{top}_2(s)$.
By definition of a 1-loop, there is some $k > j + 1$ such that $\rho{\restriction}_{[j+1,k]}$ is a return. It follows
directly that $\rho_1 := \rho{\restriction}_{[j,k]}$ is a return of the stack $s_1 := \rho(j)$ and $\mathrm{top}_2(s_1) = \mathrm{top}_2(s)$.

The first direction of the lemma follows by iterating this construction.

Now assume that $\rho$ is a run from $s$ to $s'$ with $\mathrm{top}_2(s') = \mathrm{top}_2(s)$ that decomposes as
specified above. We show that $\rho$ is a 1-loop. $\rho$ cannot visit a stack $t$ with $|t| < |s|$ because
then it in particular visits such a stack at $\lambda_j(0)$ for some $1 \le j \le n$ which contradicts $s \unlhd \lambda_j(0)$.
Moreover, if it visits some stack $t$ with $\mathrm{top}_2(t) = \mathrm{top}_2(\mathrm{pop}_1(s))$ then $t$ occurs within some
return $\rho_j$ before the final configuration of $\rho_j$. Assume that this position is $k$, i.e., the stack
at $\rho_j(k)$ is $t$. Since $\rho_j$ is a return, there is a minimal $k'$ such that the stack at $\rho_j(k')$ is
narrower than $t$. Since $|s| \le |\rho_j(k')| < |t|$, we conclude by Lemma 4.14 that $\rho_j{\restriction}_{[k,k']}$ is a
return. □

Corollary 4.38. $1\mathrm{Lp}(s)$ is determined by $\mathrm{Rt}(s)$, $\mathrm{Sym}(s)$ and $\mathrm{CLvl}(s)$.

Proof. By definition, it suffices to consider stacks of width 1. Thus, let $w$ and $w'$ be words
with $\mathrm{Sym}(w) = \mathrm{Sym}(w')$, $\mathrm{CLvl}(w) = \mathrm{CLvl}(w')$ and $\mathrm{Rt}(w) = \mathrm{Rt}(w')$. Set $s = [w]$ and
$s' = [w']$. If $s \unlhd \rho$, then $s' \unlhd \rho[s/s']$. Moreover, if $\rho'$ is a return from $(q, s : w)$ to $(q',s)$ for
some $s \in \mathrm{Stck}(\Sigma)$ then there is a return from $(q, s' : w')$ to $(q',s')$ for all $s' \in \mathrm{Stck}(\Sigma)$.
Using the decomposition from the previous lemma, we can apply stack replacement and the
existence of similar returns in order to show that $1\mathrm{Lp}(w) = 1\mathrm{Lp}(w')$. □

Proposition 4.29 now follows from Corollaries 4.36, 4.34, 4.32, 4.38 and from Proposition
4.20: we can store $\mathrm{Rt}(\mathrm{pop}_1(w))$, $\mathrm{Lp}(\mathrm{pop}_1(w))$, $\mathrm{Sym}(\mathrm{pop}_1(w))$, $\mathrm{Sym}(w)$ and $\mathrm{CLvl}(w)$ in
$2 \cdot (2^{|Q \times Q|})^2 \cdot |\Sigma|^2$ many states and update the information during a transition reading
the next letter of some word. Of course, $\mathrm{Sym}(\mathrm{pop}_1(w))$ and $\mathrm{Sym}(w)$ are only defined
for words of length at least 2. Note that we are only interested in words occurring as
topmost words of stacks. In such words, the combination $\mathrm{Sym}(w) = \bot$ and $\mathrm{Sym}(\mathrm{pop}_1(w)) \in \Sigma$
never occurs because $\bot$ is the bottom-of-stack symbol. Thus, some states from
$2^{Q \times Q} \times 2^{Q \times Q} \times \Sigma \times \{\bot\} \times \{1,2\}$ can be used to deal with the cases of words of length at
most 1 separately.
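
To get a feeling for the size of this state space, the bound can be evaluated for small parameters. The sketch below only restates the arithmetic of the bound, two subsets of Q x Q, two stack symbols, and one collapse level in {1, 2}; the function name is illustrative.

```python
def automaton_state_bound(num_states, num_symbols):
    """Evaluate 2 * (2 ** |Q x Q|) ** 2 * |Sigma| ** 2 for given |Q| and |Sigma|."""
    relations = 2 ** (num_states * num_states)  # number of subsets of Q x Q
    # Two stored relations, two stored symbols, and one collapse level in {1, 2}.
    return 2 * relations ** 2 * num_symbols ** 2
```

Even for two control states and three stack symbols the bound is already in the thousands, which reflects the doubly stored relations.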

5. Regularity of the Reachability Predicate via Enc 

Using the decomposition and automaticity results from the previous section, we show that
for each CPS $S$ the encoding Enc translates the relation Reach (on all possible configurations,
not only those occurring in the graph) into an automatic relation. Using the closure
of CPS under products with finite automata, we then extend this result to all reachability
relations $\mathrm{Reach}_L$ where $L$ is some regular language over the transition labels.

5.1. Connection between Milestones and Enc. We want to show the regularity of the
regular reachability relations on collapsible pushdown graphs. As a preparation, we develop
two correspondences between the nodes of the encoding of some stack $s$ and the (generalised)
milestones $\mathrm{MS}(s)$ ($\mathrm{GMS}(s)$, respectively). Taking a node $d \in \mathrm{Enc}(s)$ to the stack encoded
by $\mathrm{Enc}(s){\restriction}_{\{e : e \le_{lex} d\}}$ is an order isomorphism between $(\mathrm{Enc}(s), \le_{lex})$ and $(\mathrm{MS}(s), \ll)$. We
denote the image of a node $d$ under this isomorphism by $\mathrm{LStck}(d, s)$. After the discussion
of this isomorphism we develop another correspondence between nodes of $\mathrm{Enc}(s)$ and the
generalised milestones of $s$. For each $d \in \mathrm{Enc}(s)$, we define the induced general milestone
$\mathrm{IgM}(d,s) \in \mathrm{GMS}(s)$. This is the $\ll$-maximal $m \in \mathrm{GMS}(s)$ such that $\mathrm{LStck}(d, s) \unlhd m$.
Apparently, if $\mathrm{LStck}(d, s) \unlhd s$, then the induced general milestone is $s$. This occurs if and
only if $d$ is in the rightmost path of $\mathrm{Enc}(s)$. In all other cases $\mathrm{IgM}(d, s)$ is the $\ll$-maximal
generalised milestone of $s$ whose topmost word is a copy of the topmost word of $\mathrm{LStck}(d, s)$.
In this case $\mathrm{top}_2(\mathrm{LStck}(d, s))$ is not a prefix of $\mathrm{top}_2(s)$ and $\mathrm{IgM}(d, s)$ is not a milestone.

LStck and IgM are useful concepts for the analysis of runs from some milestone $m$ of $s$
to $s$ due to the fact that any generalised milestone of $s$ occurs in the image of LStck or in
the image of IgM. Furthermore, the generalised milestones associated to some node $d$ are
closely connected to those associated to its successors. Assume that $d, d0, d1 \in \mathrm{Enc}(s)$. Then
$\mathrm{LStck}(d0, s)$ is the $\ll$-successor of $\mathrm{LStck}(d, s)$, $\mathrm{LStck}(d1, s)$ is the $\ll$-successor of $\mathrm{IgM}(d0, s)$
and $\mathrm{IgM}(d, s) = \mathrm{IgM}(d1, s)$. If $m \ll \mathrm{LStck}(d, s)$, Corollary 4.10 implies that a run from $m$
to $s$ which does not visit substacks of $m$ visits $\mathrm{LStck}(d, s)$, $\mathrm{LStck}(d0, s)$, $\mathrm{IgM}(d0, s)$, etc. It
also implies that $\ll$-successors are connected by one operation followed by some loop.

We will later show that the combination of these observations is the key to the regularity
of the reachability relations. A finite automaton may guess at each node $d$ the last states
in which $\mathrm{LStck}(d, s)$ and $\mathrm{IgM}(d, s)$ are visited by some run from some milestone $m$ to $s$.
Since the direct $\ll$-successors of these stacks are encoded in the successor or predecessor
of $d$, the automaton can check that these guesses are locally consistent, i.e., that there is a
single transition followed by a loop connecting $(q_1,s_1)$ to $(q_2,s_2)$ where $s_1$ and $s_2$ are the
generalised milestones represented by $d$ and its successor (or predecessor). If the guess of
the automaton is locally consistent at all nodes, it witnesses the existence of a run from $m$
to $s$.

Definition 5.1. Let $T \in \mathbb{T}_{\mathrm{Enc}}$ be a tree and $d \in T \setminus \{\varepsilon\}$. Then the left and downward
closed tree of $d$ is $\mathrm{LT}(d,T) := T{\restriction}_D$ where $D := \{d' \in T : d' \le_{lex} d\}$. We denote by
$\mathrm{LStck}(d, T) := \pi_2(\mathrm{Dec}(\mathrm{LT}(d, T)))$ the left stack induced by $d$. $\pi_2$ denotes the projection to
the stack of $\mathrm{Dec}(\mathrm{LT}(d,T))$. If $T$ is clear from the context, we omit it.
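
The domain of the left and downward closed tree is easy to compute when nodes are represented as strings over 0 and 1. The sketch below makes two assumptions that are not part of the paper: nodes are Python strings with the root as the empty string, and the lexicographic order of the text coincides with Python's string order, in which a proper prefix is smaller than its extensions and 0 is smaller than 1. The function name is illustrative.

```python
def left_closed_domain(tree, d):
    """Return all nodes of `tree` that are lexicographically at most `d`."""
    if d == "" or d not in tree:
        raise ValueError("d must be a non-root node of the tree")
    # Python compares strings lexicographically; a proper prefix is smaller.
    return {node for node in tree if node <= d}
```

Since every prefix of a node is lexicographically smaller than the node itself, the resulting set is again prefix-closed, so it is the domain of a tree.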

Remark 5.2. We exclude the case $d = \varepsilon$ from the definition because the root encodes the
state of the configuration and not a part of the stack. In order to simplify notation, we use
the following conventions. Let $c = (q,s)$ be a configuration. For arbitrary $d \in \{0,1\}^*$, we
set $\mathrm{LStck}(d, s) := \mathrm{LStck}(0d, c) := \mathrm{LStck}(0d, \mathrm{Enc}(c))$.

Recall that $w := \mathrm{top}_2(\mathrm{LStck}(d, s)){\downarrow}_0$ is $\mathrm{top}_2(\mathrm{LStck}(d, s))$ where all level 2 links are set
to 0. Due to the definition of the encoding, for every $d \in \mathrm{Enc}(s)$, $w$ is determined by the
path from the root to $d$: interpreting $\varepsilon$ as empty word, the word along this path contains
the pairs of stack symbols and collapse levels of the letters of $\mathrm{top}_2(\mathrm{LStck}(d, s))$. Since all
level 2 links in $w$ are 0, $w$ is determined by this path. Thus, Proposition 4.29 implies that
there is an automaton that calculates at each position $d \in \mathrm{Enc}(q, s)$ the existence of loops
of $\mathrm{LStck}(d, \mathrm{Enc}(q, s))$ with given initial and final state.

$\mathrm{LStck}(d, \mathrm{Enc}(q, s))$ is a substack of $s$ for all $d \in \mathrm{Enc}(q, s)$. This observation follows from
Remark 3.4 combined with the fact that the left stack is induced by a lexicographically
downward closed subset.

Lemma 5.3. Let $s \in \mathrm{Stck}(\Sigma)$. For each $d \in \mathrm{Enc}(s)$ we have $\mathrm{LStck}(d, s) \in \mathrm{MS}(s)$. Furthermore,
for each $s' \in \mathrm{MS}(s)$ there is some $d \in \mathrm{Enc}(s)$ such that $s' = \mathrm{LStck}(d, s)$.

Proof. For the first claim, let $d \in \mathrm{Enc}(s)$. We know that $s_d := \mathrm{LStck}(d, s)$ is a substack of
$s$. Recall that the path from the root to $d$ encodes $\mathrm{top}_2(s_d)$. Furthermore, by definition
of Enc, $d$ corresponds to some maximal block $b$ occurring in $s$ in the following sense: there
are 2-words $s_1, s_2$ and a word $w$ such that $s = s_1 : (w \setminus b) : s_2$ and such that the subtree
rooted at $d$ encodes $b$. Moreover, $d$ encodes the first letter of $b$, i.e., if $b$ is a $\tau$-block, then
the path from the root to $d$ encodes $w\tau$.

Note that by maximality of $b$, the greatest common prefix of the last word of $s_1$ and the
first word of $w \setminus b$ is a prefix of $w\tau$. Since the elements that are lexicographically smaller
than $d$ encode the blocks to the left of $b$, one sees that $s_d = s_1 : w\tau$. Setting $k := |s_d|$, we
conclude that $s_d$ is a substack of $s$ such that the greatest common prefix of the $(k-1)$-st
and the $k$-th word of $s$ is a prefix of $\mathrm{top}_2(s_d)$. Recall that this matches exactly the definition
of a milestone of $s$. Thus, $s_d$ is a milestone of $s$ and we have completed the proof of the first
claim.

Now we turn to the second claim. The fact that every milestone $s' \in \mathrm{MS}(s)$ is indeed
represented by some node of $\mathrm{Enc}(s)$ can be seen by induction on the block structure of
$s'$. Assume that $s' \in \mathrm{MS}(s)$ and that $s'$ decomposes as $s' = b_0 : b_1 : \dots : b_{m-1} : b'_m$ into
maximal blocks. We claim that $s$ then decomposes as $s = b_0 : b_1 : \dots : b_{m-1} : b_m : \dots : b_n$
into maximal blocks. In order to verify this claim, we have to prove that $b_{m-1}$ cannot be
the initial segment of a larger block $b_{m-1} : b_m$ in $s$. Note that if $b'_m$ only contains one letter,
then by definition of a milestone the last word of $b_{m-1}$ and the first word occurring in $s$
after $b_{m-1}$, which is the first word of $b_m$, can only have a common prefix of length at most
1. Hence, their composition does not form a block. Otherwise, the first word of $b'_m$ contains
two letters which do not coincide with the first two letters of the words in $b_{m-1}$. Since this
word is by definition a prefix of the first word in $b_m$, we can conclude again that $b_{m-1} : b_m$
does not form a block.

Note that all words in the blocks $b_i$ for $1 \le i \le n$ and in the block $b'_m$ share the same
first letter which is encoded at the position $\varepsilon$ in $\mathrm{Enc}(s)$ and in $\mathrm{Enc}(s')$. By the definition of
$\mathrm{Enc}(s)$ the blockline induced by $b_i$ is encoded in the subtree rooted at $1^i0$ in $\mathrm{Enc}(s)$. For
$i < m$ the same holds in $\mathrm{Enc}(s')$. We set $d := 1^m$. Note that $\mathrm{Enc}(s')$ and $\mathrm{Enc}(s)$ coincide
on all elements that are lexicographically smaller than $d$ (because these elements encode
the blocks $b_1 : b_2 : \dots : b_{m-1}$).

Now, we distinguish the following cases.

(1) Assume that $b'_m = [\tau]$ for $\tau \in \Sigma \cup (\Sigma \times \{2\} \times \mathbb{N})$. Then the block $b'_m$ consists of only
one letter. In this case $d$ is the lexicographically largest element of $\mathrm{Enc}(s')$ whence
$s' = \mathrm{LStck}(d, \mathrm{Enc}(s')) = \mathrm{LStck}(d, \mathrm{Enc}(s))$.

(2) Otherwise, there is a $\tau \in \Sigma \cup (\Sigma \times \{2\} \times \mathbb{N})$ such that

$b_m = \tau \setminus (c_0 : c_1 : \dots : c_{m'-1} : c_{m'} : \dots : c_{n'})$ and

$b'_m = \tau \setminus (c_0 : c_1 : \dots : c_{m'-1} : c'_{m'})$

for some $m' \le n'$ such that $c_0 : c_1 : \dots : c_{n'}$ are the maximal blocks of the blockline
induced by $b_m$ and $c_0 : c_1 : \dots : c_{m'-1} : c'_{m'}$ are the maximal blocks of the blockline
induced by $b'_m$. Now, $c_1 : c_2 : \dots : c_{m'-1}$ are encoded in the subtrees rooted at $d01^i0$
for $0 < i \le m' - 1$ in $\mathrm{Enc}(s)$ as well as in $\mathrm{Enc}(s')$. $c_{m'+1} : c_{m'+2} : \dots : c_{n'}$
is encoded in the subtree rooted at $d01^{m'+1}$ in $\mathrm{Enc}(s)$ and these elements are all
lexicographically larger than $d01^{m'}0$. Hence, we can set $d' := d01^{m'}$ and repeat this
case distinction on $d'$, $c'_{m'}$, and $c_{m'}$ instead of $d$, $b'_m$ and $b_m$.

Since $s'$ is finite, by repeated application of the case distinction, we will eventually end up
in the first case where we find a $d \in \mathrm{Enc}(s)$ such that $s' = \mathrm{LStck}(d, \mathrm{Enc}(s))$. □

The next lemma states the tight connection between milestones of a stack (with substack 
relation) and elements in the encoding of this stack (with lexicographic order). 

Lemma 5.4. $\mathrm{LStck}(\cdot, s)$ is an isomorphism between $(\mathrm{dom}(\mathrm{Enc}(s)), \le_{lex})$ and $(\mathrm{MS}(s), \ll)$.

Proof. If the successor of $d$ in lexicographic order is $d0$, then the left stack of the latter
extends the former by just one letter. Otherwise, the left and downward closed tree of the
successor of $d$ contains more elements ending in 1, whence it encodes a stack of larger width.
Since each left and downward closed tree induces a milestone, it follows that this map is an order
isomorphism. □

Recall that by Corollary 4.10 each run to a configuration $(q, s)$ visits the milestones of
$s$ in the order given by the substack relation. With the previous lemma, this translates into
the fact that the left stacks induced by the elements of $\mathrm{Enc}(q, s)$ are visited by the run in
lexicographical order of the elements of $\mathrm{Enc}(q, s)$.

Besides the tight correspondence of milestones and nodes of the encoding, there is another
correspondence between generalised milestones and nodes. Using both correspondences,
each generalised milestone is represented by a node of the encoding. The following
definition describes the second correspondence. Recall that $\le$ denotes the prefix relation
on trees.

Definition 5.5. Let $T \in \mathbb{T}_{\mathrm{Enc}}$ be the encoding of a configuration. Let $d \in T \setminus \{\varepsilon\}$. Set
$D := \mathrm{LT}(d, T) \cup \{d' \in T : d \le d'\}$. Let $\mathrm{exLStck}(d,T) := \pi_2(\mathrm{Dec}(T{\restriction}_D))$ where $\pi_2(c)$ is the
projection to the stack of the configuration $c$.

By case distinction on the rightmost branch of $T$, we define the generalised milestone
induced by $d$ as follows.

(1) If $d$ is in the rightmost branch of $T$, then $\mathrm{IgM}(d, T) := \mathrm{exLStck}(d, T) = \pi_2(\mathrm{Dec}(T))$,

(2) otherwise, set $\mathrm{IgM}(d,T) := \mathrm{exLStck}(d,T) : \mathrm{top}_2(\mathrm{LStck}(d, T))$.
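
The node set D used in this definition and the case distinction on the rightmost branch can be sketched as follows. The sketch assumes that nodes are Python strings over 0 and 1 with the root as the empty string, that the lexicographic order is Python's string order, and that the rightmost branch consists of the prefixes of the lexicographically largest node; these representation choices and the function names are assumptions of the sketch, not definitions from the paper.

```python
def extended_domain(tree, d):
    """Nodes that are lexicographically at most d, together with all nodes below d."""
    return {n for n in tree if n <= d or n.startswith(d)}


def on_rightmost_branch(tree, d):
    """True if d is a prefix of the lexicographically largest node of the tree."""
    # Assumption: the rightmost branch is the path from the root to max(tree).
    return max(tree).startswith(d)
```

With these two helpers, the first case of the definition applies exactly when `on_rightmost_branch` holds, and `extended_domain` gives the node set that is decoded in both cases.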

Remark 5.6. As the name indicates, IgM(d, T) is always a generalised milestone of
Dec(T). For d in the rightmost branch of T this holds trivially. For all d ∈ T that are
not in the rightmost branch, note the following:

• If d1 ∈ T, then IgM(d, T) = IgM(d1, T).

• If d is a leaf, then IgM(d, T) = clone_2(LStck(d, T)). Since LStck(d, T) is the maximal
milestone of Dec(T) of width |LStck(d, T)|, IgM(d, T) is a generalised milestone of Dec(T)
(since d is not in the rightmost branch, |Dec(T)| > |LStck(d, T)|).

• If d, e ∈ T such that e = d0 and d1 ∉ T, then IgM(d, T) = pop_1(IgM(e, T)).

• If d, e ∈ T such that e = d0 and d1 ∈ T, then LStck(d1, T) = pop_1(IgM(e, T)).

By induction from the leaves to the inner nodes one concludes that IgM(d, T) is a generalised
milestone of Dec(T) for each d ∈ T. One also shows that, for each generalised milestone
m of Dec(T) that is not a milestone, there is some d such that m = IgM(d, T) as follows.
For s = w_1 : w_2 : ... : w_n, let w_{i_k} be the k-th word occurring in s such that w_{i_k} is not
a prefix of w_{i_k + 1}. Then w_1 : w_2 : ... : w_{i_k - 1} : w_{i_k} = LStck(d, T) for the k-th leaf d and
IgM(d, T) = w_1 : w_2 : ... : w_{i_k - 1} : w_{i_k} : w_{i_k} is the k-th generalised milestone of the form
w_1 : w_2 : ... : w_{j-1} : w_j : w_j for some j < n which is not a milestone. Now one can show the
following. If IgM(d, T) is not a milestone (but a generalised one) and pop_1(IgM(d, T)) is
not a milestone, then pop_1(IgM(d, T)) is a generalised milestone, there is a node e such that
d = e01^k for some k ∈ N, and pop_1(IgM(d, T)) = IgM(e, T). Evidently, every generalised
milestone that is not a milestone is of the form pop_1^k(w_1 : w_2 : ... : w_{i-1} : w_i : w_i) for some
i < n. Thus, this proves the claim.
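
The stack operations appearing in Remark 5.6 can be made concrete with a small sketch. The following Python fragment is only an illustration under simplifying assumptions: a level 2 stack is modelled as a list of words, collapse links are ignored, pop_1 is assumed to be undefined on a topmost word of length 1, and the names `width`, `clone2`, `pop1` and `pop2` are hypothetical helpers, not notation from the paper. It shows the shape w_1 : ... : w_i : w_i produced by clone_2 and the effect of pop_1 on it.

```python
# A level 2 stack is modelled as a list of words; each word is a tuple of symbols.
# Collapse links are deliberately ignored in this simplified sketch.

def width(stack):
    """Number of words of the stack, written |s| in the text."""
    return len(stack)

def clone2(stack):
    """clone_2: duplicate the topmost (last) word."""
    return stack + [stack[-1]]

def pop1(stack):
    """pop_1: remove the topmost symbol of the topmost word.

    Assumption: the operation is undefined if only the bottom symbol is left."""
    top = stack[-1]
    if len(top) <= 1:
        raise ValueError("pop_1 is undefined on a word of length 1")
    return stack[:-1] + [top[:-1]]

def pop2(stack):
    """pop_2: remove the topmost word (undefined on stacks of width 1)."""
    if len(stack) <= 1:
        raise ValueError("pop_2 is undefined on a stack of width 1")
    return stack[:-1]

# Example: s = w_1 : w_2 with w_1 = bot a and w_2 = bot a b.
s = [("bot", "a"), ("bot", "a", "b")]
generalised = clone2(s)          # w_1 : w_2 : w_2
shorter = pop1(generalised)      # w_1 : w_2 : (bot a)
```

In this toy model, `clone2(s)` is wider than `s` by exactly one word, which matches the width argument used for leaves in the remark.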

5.2. Tree-Automaticity of Reachability. In this section we show that the reachability
relation Reach is automatic via Enc. In the next section we extend this result to the regular
reachability predicates. Recall that due to Remark 4.3 a proof of the regularity of the
relations R⇐, R⇓, R⇑, and R⇒ implies the regularity of Reach.

5.2.1. Regularity of the Relation R⇐. Recall that a pair of configurations (c_1, c_2) is in R⇐
if and only if they are connected by a run ρ that decomposes as explained in Lemma
4.17. In Appendix B we construct an automaton A_{R⇐} recognising R⇐ based on the
following idea. A_{R⇐} guesses the decomposition according to Lemma 4.17 and identifies
a node representing the stack reached after each part of the decomposition. A_{R⇐} labels
this node by the initial and final state of the segment of the decomposition starting at the
corresponding stack and checks whether the labellings of all the representatives fit together.
The fact that A_{R⇐} can check the correctness of its guess relies heavily on the computability
of the returns and 1-loops of the stack LStck(d, Enc(q, s)) along the path from the root of
Enc(q, s) to d. We next explain how A_{R⇐} processes two configurations c_1 = (q_1, s_1) and
c_2 = (q_2, s_2). Let us assume that s_2 = pop_2^k(s_1) and let ρ be a run from c_1 to c_2 witnessing
(c_1, c_2) ∈ R⇐. Let us first assume that ρ is a sequence of returns ρ = ρ_1 ∘ ... ∘ ρ_k. Due
to the special form of s_2, Enc(s_1) and Enc(s_2) agree on the domain of Enc(s_2). Moreover,
Enc(s_1) extends Enc(s_2) by k paths to nodes d_1, ..., d_k such that each d_i does not have a
0-successor. Moreover, all d_i are lexicographically larger than all nodes in Enc(s_2). There
is a close correspondence between the ρ_i and the d_i: LStck(d_i, Enc(s_1)) = pop_2^{k-i}(s_1) and
ρ_i starts in LStck(d_{k+1-i}, Enc(s_1)) and ends in LStck(d_{k-i}, Enc(s_1)) (where we set d_0 to
be the rightmost leaf of Enc(s_2)). A_{R⇐} guesses the existence of the run ρ as follows: at
first, it checks that Enc(s_1) and Enc(s_2) agree on the domain of Enc(s_2). Secondly, along
the common prefix it propagates the initial and final state of ρ, i.e., the states q_1 and q_2,
and it computes at each node d the possible returns at the corresponding milestone. Now
assume that at some node e there starts a left and a right branch such that the left branch
is a prefix of d_i, ..., d_j and the right branch is a prefix of d_{j+1}, ..., d_k. At this position
the automaton guesses that LStck(d_{j+1}, Enc(c_1)) and LStck(d_j, Enc(c_1)) are connected via
a return and guesses the initial and final state q_i, q_e. Now the automaton propagates along
the left branch starting at e the state information q_e and q_2 (trying to find a run from
(q_e, LStck(d_j, Enc(c_1))) to c_2) and along the right branch the information q_1, (R, q_i, q_e) (trying
to find a run from c_1 to c' = (q_i, LStck(d_{j+1}, Enc(c_1))) such that there is a return starting
in c' and ending in state q_e). Doing the same at each splitting point of the prefixes of the
d_1, ..., d_k, the automaton finally reaches each node d_i with a tuple (R, q_i, q_i') of states such
that it has to check whether there is a return starting in (q_i, LStck(d_i, Enc(c_1))) and ending
in state q_i'. But since the returns of LStck(d_i, Enc(c_1)) are computable with an automaton
reading the path to d_i, this can be easily checked with a tree-automaton.

The case that ρ decomposes as a sequence of returns is the easiest one because all parts
of the decomposition then start and end in stacks that are milestones of c_1. Now assume
that in the decomposition of ρ according to Lemma 4.17 1-loops followed by pop_1 or collapse
operations occur. For simplicity of the explanation assume that ρ = λ_1 ∘ λ_2 ∘ λ_3 such that
λ_1 and λ_3 decompose as sequences of returns and λ_2 = ρ_1 ∘ ... ∘ ρ_n decomposes such that
ρ_i is a sequence of 1-loops followed by pop_1 and collapse where only ρ_n ends in a collapse
of level 2. Using the notation from the previous case, we find d_i and d_j (i < j) among
d_1, ..., d_k such that λ_2 starts in stack LStck(d_j, Enc(c_1)) and ends in LStck(d_i, Enc(c_1)).
We would like to treat this case similarly to the return case, but note that the final stacks
of ρ_1, ..., ρ_{n-1} do not necessarily appear as milestones of s_1: in general, these stacks can
be wider than s_1. The key observation that allows us to represent these stacks by certain
milestones is the following: let t_0 := LStck(d_j, Enc(c_1)) be the stack in which ρ_1 starts.
By definition of a 1-loop followed by a pop_1 operation, the final stack of ρ_1 is some stack
t_1 such that top_2(t_1) = top_2(pop_1(t_0)). In particular, each level 2 collapse link in top_2(t_1)
points to the same substack as the corresponding element of top_2(t_0). If e_1 is the unique
ancestor of d_j such that d_j = e_1 0 1^x for some x ∈ N, one sees that t_1 and LStck(e_1, Enc(c_1))
agree on their topmost words including the targets of their level 2 collapse links. Note
that λ_2 (modulo widening the stack during 1-loops) basically performs pop_1/collapse of
level 1 on top_2(t_0) and finally a collapse of level 2 on a prefix of top_2(t_0). Thus, with
respect to the stack operations induced by the run ρ_2 ∘ ... ∘ ρ_n, m_1 := LStck(e_1, Enc(c_1))
and t_1 agree and e_1 can serve as representative of t_1. Iterating this argument, we find
nodes e_2, ..., e_{n-1} such that m_x := LStck(e_x, Enc(c_1)) agrees with the final stack t_x of
ρ_x on the topmost word including the targets of all level 2 collapse links for all x < n.
Moreover, collapse(t_{n-1}) = collapse(m_{n-1}) = LStck(d_i, Enc(c_1)).

The automaton A_{R⇐} uses this observation as follows. Processing the encodings of c_1 and c_2
from the root to the leaves it arrives at the greatest common prefix f of d_i and d_j with a
guess (q_i, q_e) of the initial and final state of the subrun of ρ which decomposes as
λ_1' ∘ λ_2 ∘ λ_3' where λ_2 is defined as before and λ_1' is a suffix of λ_1 and λ_3' a prefix
of λ_3. Now the automaton guesses the initial and final state (q_1, q_2) of the run λ_2 and
propagates along the left branch the guess that λ_3' is a run from q_2 to q_e and along the
right branch a guess of the form (C, q_1, q_2) meaning that the last part of λ_1' ∘ λ_2 is a
1-loop followed by collapse (hence the "C") from state q_1 to state q_2. It then
nondeterministically guesses the path to d_j and updates a guess (X, q_1, q_2) with X ∈ {C, P}
at node e_x as follows. From e_x (x ∈ N) on it propagates a guess (P, q_1, q_2') towards d_j
meaning that the part of λ_2 connecting the corresponding nodes of e_{x-1} and e_x is a 1-loop
followed by a pop_1/collapse of level 1 ("P" for pop) from state q_1 to state q_2'. It verifies
the compatibility of the guess (X, q_1, q_2) at e_x and the guess (P, q_1, q_2') by checking
that for any stack with the same topmost word as m_x = LStck(e_x, Enc(c_1)) there is a 1-loop
followed by an operation induced by X from state q_2' to state q_1 (induced operation means
collapse of level 2 if X = C and pop_1 or collapse of level 1 if X = P). This way the
automaton reaches d_j with a guess (X, q_1, q_2) and needs to verify that there is a 1-loop
followed by an operation induced by X starting in (q_1, t_0) and ending in q_2. Since an
automaton can keep track of the possible 1-loops at each node of the tree this is possible.
The automaton can guess states and successfully verify its assumptions if and only if there
is a run from LStck(d_j, Enc(c_1)) to LStck(d_i, Enc(c_1)) with initial and final state as guessed
at the node f that decomposes into 1-loops followed by one pop_1 or collapse operation each.
Combining this idea with the verification of guesses on parts of the run ρ that are returns,
the tree-automaton accepts the encodings of two configurations if and only if this pair of
configurations is in R⇐.

In Appendix B we show that the described automaton works correctly and can be
implemented with exponentially many states in the number of states of the collapsible
pushdown system. Thus, we obtain the following result.

Lemma 5.7. There are two polynomials p_1, p_2 such that the following holds. Let S be a
level 2 collapsible pushdown system with stack alphabet Σ and state space Q. There is an
automaton with p_1(|Σ|) · exp(p_2(|Q|)) many states that accepts the convolution of two trees
if and only if this convolution is of the form Enc(c_1) ⊗ Enc(c_2) for c_1, c_2 configurations with
(c_1, c_2) ∈ R⇐.

5.2.2. Regularity of the Relation R⇓. Recall that the relation R⇓ from Definition 4.2 contains
pairs (c_1, c_2) if c_2 = pop_1^m(c_1) and there is some run from c_1 to c_2 not visiting any
substack of c_2 before its final configuration. A simple induction on the blocks in c_1 and c_2
yields the following characterisation of Enc(c_1) ⊗ Enc(c_2). For d ∈ {0, 1}*, |d|_0 denotes the
number of 0's in d.

Lemma 5.8. Let s_1, s_2 be stacks and d the rightmost leaf of Enc(s_2). s_2 = pop_1^m(s_1) for
some m ∈ N if and only if Enc(s_1) ⊗ Enc(s_2) is of one of the following forms.

(1) If d ∈ Enc(s_1), then dom(Enc(s_1)) = dom(Enc(s_2)) ∪ {d0^k : k ≤ m} and Enc(s_1) and
Enc(s_2) agree on dom(Enc(s_2)).

(2) If d ∉ Enc(s_1), then d ∈ {0, 1}*1. Let c ∈ {0, 1}* be the predecessor of d. There is some
e ∈ {0, 1}* such that ce is the rightmost leaf of Enc(s_1). Then |e|_0 = m,

dom(Enc(s_1)) = (dom(Enc(s_2)) ∪ {x : c < x ≤ ce}) \ {d}

and Enc(s_1) and Enc(s_2) agree on dom(Enc(s_2)) \ {d}. Moreover, there exists some
f ∈ {0, 1}*1 such that

• ce = f0^k for some k ≤ m and

• dom(Enc(s_1)) \ dom(Enc(s_2)) = {x : f ≤ x ≤ ce}.
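
On the level of stacks (rather than their encodings), the relation s_2 = pop_1^m(s_1) characterised in Lemma 5.8 is easy to test directly. The following Python sketch is only an illustration: stacks are lists of words, collapse links are ignored (with links, letters would have to be compared together with their links), and the names `zeros` and `pop1_distance` are hypothetical helpers rather than notation from the paper.

```python
def zeros(d):
    """|d|_0: the number of 0's in a node address d, given as a string over {0, 1}."""
    return d.count("0")

def pop1_distance(s1, s2):
    """Return m such that s2 = pop_1^m(s1), or None if no such m exists.

    Stacks are non-empty lists of words; words are tuples of symbols."""
    if not s1 or len(s1) != len(s2) or s1[:-1] != s2[:-1]:
        return None                      # all words below the topmost one must coincide
    top1, top2 = s1[-1], s2[-1]
    if len(top2) == 0 or top1[:len(top2)] != top2:
        return None                      # the top word of s2 must be a non-empty prefix
    return len(top1) - len(top2)

s1 = [("bot", "a"), ("bot", "a", "b", "c")]
s2 = [("bot", "a"), ("bot", "a")]
m = pop1_distance(s1, s2)
```

Here `m` is the number of symbols removed from the topmost word, which is the quantity that the lemma recovers from the encoding as a count of 0's along the path to the rightmost leaf.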

In Remark 5.2 we pointed out that the path from the root to the rightmost leaf of
Enc(s_1) encodes top_2(s_1). If the first case of the characterisation applies, then the k-th
predecessor x_k of the rightmost leaf of Enc(s_1) satisfies LStck(x_k, s_1) = pop_1^k(s_1) for all
k ≤ m. If the second case applies, for all c < x ≤ ce with |x|_0 - |c|_0 = k ≤ m, the path to x
encodes w_k := top_2(pop_1^{m-k}(s_1)) and top_2(LStck(x, s_1)) = w_k. Thus, the elements on the
path from d (or c, respectively) to the rightmost leaf of Enc(s_1) may serve as representatives
of the stacks that a run from s_1 to s_2 passes.

Recall the decomposition into high loops of witnessing runs for (c_1, c_2) ∈ R⇓ from
Lemma 4.18. We describe informally the automaton A_{R⇓} that recognises the relation R⇓.
A_{R⇓} guesses the path to the rightmost leaf of Enc(c_2) and keeps track of hLp(d) at each node
d on this path. Each node d on the path from the rightmost leaf of Enc(c_2) to the rightmost
leaf of Enc(c_1) is labelled by the state of c_1, by Rt(pop_1(LStck(d, c_1))), by Sym(LStck(d, c_1)),
by CLvl(LStck(d, c_1)) and by a guess q_d ∈ Q of a final state of some run from c_1 to
pop_1^{m-k}(s_1) for k appropriate such that top_2(LStck(d, c_1)) = top_2(pop_1^{m-k}(c_1)). Recall
that such a state determines the set hLp(LStck(d, c_1)). The automaton can verify that the
guesses of the q_d are consistent in the following sense. If it has labelled some node d with a
state q_d such that there is a run from c_1 to (q_d, pop_1^{m-k}(s_1)), then there is also a run from
c_1 to (q_c, pop_1^{m-k-(1-i)}(s_1)) for c the node such that ci = d. If i = 1, q_c = q_d and if i = 0
then the run to s' := pop_1^{m-k-(1-i)}(s_1) is extended by a high loop of s' followed by a pop_1
or collapse of level 1. Since the automaton "knows" the possible high loops, the topmost
symbol and the link level of the stack, this check is trivial. Since it also stores the state
of c_1, A_{R⇓} can verify that its guess at the rightmost leaf of Enc(c_1) is a state q such that
there is a high loop starting in c_1 and ending in state q. We postpone the formal definition
of A_{R⇓} and the proof of the following lemma to Appendix C.

Lemma 5.9. There are polynomials p_1, p_2 such that the following holds. Let S be a level 2
collapsible pushdown system with stack alphabet Σ and state space Q. There is an automaton
with p_1(|Σ|) · exp(p_2(|Q|)) many states that accepts the convolution of two trees if and only if
this convolution is of the form Enc(c_1) ⊗ Enc(c_2) for c_1, c_2 configurations with (c_1, c_2) ∈ R⇓.

5.2.3. Regularity of the Relation R⇑. Recall that the relation R⇑ is in some sense the backward
version of the relation R⇓. (c_1, c_2) ∈ R⇓ holds if there is a sequence of high loops, pop_1
and collapse of level 1 connecting c_1 with c_2. Analogously, (c_1, c_2) ∈ R⇑ holds if there is a
sequence of high loops and push_{σ,l} operations that generates c_2 from c_1. Since (c_1, c_2) ∈ R⇑
implies that c_1 = pop_1^k(c_2) for some k ∈ N, Lemma 5.8 applies analogously. There is just
one further condition: if (c_1, c_2) ∈ R⇑ and top_2(c_1) < top_2(c_2) ⊓ top_2(pop_2(c_2)), then there
is some word w such that top_2(c_2) ⊓ top_2(pop_2(c_2)) = top_2(c_1)w and w only contains links of
level 1 (otherwise, the stack of c_2 cannot be generated from the stack of c_1 without passing
pop_2(c_1)).

The automaton A_{R⇑} recognising the relation R⇑ via Enc does the following. Due to
Lemma 5.8 the path from the rightmost leaf of Enc(c_1) to the rightmost leaf of Enc(c_2) has
the following form: For each |top_2(c_1)| < m ≤ |top_2(c_2)| it contains nodes d, d1, ..., d1^{k_m}
with |d|_0 = m such that top_2(LStck(d, Enc(c_2))) is the prefix of top_2(c_2) of length m. Recall
that A_{R⇓} tries to label d with a state q_e and e = d1^{k_m} with a state q_e' such that q_e' and q_e
are connected by a high loop of LStck(e, Enc(c_1)) plus a pop_1 or collapse of level 1. Since
A_{R⇑} works in the other direction, it labels d with a state q_i and d1^{k_m} with a state q_i'
such that q_i and q_i' are connected by a high loop of LStck(d, Enc(c_2)) followed by a push_{σ,l}.
Furthermore, it checks that l = 1 as long as the path to d encodes a proper prefix of
top_2(c_2) ⊓ top_2(pop_2(c_2)) which is not a prefix of top_2(c_1). With these remarks, the formal
construction of A_{R⇑} from A_{R⇓} (cf. Appendix C) is left to the reader.

Lemma 5.10. There are two polynomials p_1, p_2 such that the following holds. Let S be a
level 2 collapsible pushdown system with stack alphabet Σ and state space Q. There is an
automaton with p_1(|Σ|) · exp(p_2(|Q|)) many states that accepts the convolution of two trees
if and only if this convolution is of the form Enc(c_1) ⊗ Enc(c_2) for c_1, c_2 configurations with
(c_1, c_2) ∈ R⇑.

5.2.4. Regularity of the Relation R⇒. Given a CPS S = (Q, Σ, Γ, Δ_S, q_0), we define an
automaton A_{R⇒} that recognises the relation R⇒ in the following sense. Given configurations
c_1 = (q_1, s_1) and c_2 = (q_2, s_2), A_{R⇒} accepts Enc(c_1) ⊗ Enc(c_2) if and only if s_1 = pop_2^k(s_2)
for some k ∈ N and there is a run ρ of S from c_1 to c_2 witnessing (c_1, c_2) ∈ R⇒.

We informally explain how A_{R⇒} processes the encoding Enc(c_1) ⊗ Enc(c_2) of two
configurations c_1, c_2 in order to verify (c_1, c_2) ∈ R⇒. First of all the automaton guarantees
that s_1 = pop_2^k(s_2) for some k ∈ N (this is the case if and only if Enc(s_1) is a subtree of
Enc(s_2), the rightmost leaf l of Enc(s_1) does not have a 0-successor in Enc(s_2) and for each
d ≤_lex l, d ∈ Enc(s_2) ⇔ d ∈ Enc(s_1)).

Assume that c_1 = (q_1, s_1) and c_2 = (q_2, s_2) with s_1 = pop_2^k(s_2). If there is a run from
c_1 to c_2, its form is described in Corollary 4.10: it is a sequence of loops followed by one
operation each that starts in some generalised milestone of s_2 and leads to the next generalised
milestone (with respect to ≪). Recall the following: each node d in Enc(s_2) which is not
contained in Enc(s_1) corresponds to a milestone in MS(s_2) \ MS(s_1) via LStck(d, Enc(s_2)).
Moreover, each node in the rightmost branch of Enc(s_1) or in Enc(s_2) \ Enc(s_1) corresponds
to a generalised milestone in GMS(s_2) \ GMS(s_1) via IgM. The essence of Remark 5.6 is
that for each generalised milestone represented by a node d the node representing the
≪-successor of this generalised milestone can be found locally around d. Since we can compute
the possible loops of the stacks LStck(d, Enc(c_2)) and IgM(d, Enc(c_2)) along the path from
the root to d, a tree-automaton may guess the initial and final states of each part of the
decomposition of a run according to Corollary 4.10 and check the local compatibility of each
of the guesses.

The detailed definition of A_{R⇒} as well as a proof of the following lemma can be found
in Appendix D.

Lemma 5.11. There are two polynomials p_1, p_2 such that the following holds. Let S be a
level 2 collapsible pushdown system with stack alphabet Σ and state space Q. There is an
automaton with p_1(|Σ|) · exp(p_2(|Q|)) many states that accepts the convolution of two trees
if and only if this convolution is of the form Enc(c_1) ⊗ Enc(c_2) for c_1, c_2 configurations with
(c_1, c_2) ∈ R⇒.

5.3. Regularity of Reach_L. In this part, we use the closure of collapsible pushdown systems
under products with finite automata in order to provide a proof of the automaticity
of all regular reachability predicates (Proposition 3.9): we reduce regular reachability to
reachability in a product of the collapsible pushdown system with the automaton for the
regular language.

Recall that for L ⊆ Γ* some (string-)language, Reach_L is the binary relation that
contains a pair of configurations (c, c') if and only if there is a run ρ from c to c' such that
the labels of the transitions used in ρ form a word w ∈ L.

Let L be some regular language and A_L an automaton recognising L. We construct the
product S × A_L. Reach_L on CPG(S) is expressible via the relation Reach on CPG(S × A_L).
As a corollary of this result, we obtain the tree-automaticity of Reach_L.

Definition 5.12. Let S = (Q, Σ, Γ, q_0, Δ) be a 2-CPS and let A_L = (Q_L, Γ, i_0, F, Δ_L) be a
finite word-automaton. We define the product of S and A_L to be the collapsible pushdown
system

S × A_L := (Q × Q_L, Σ, Γ, (q_0, i_0), Δ_×) where

Δ_× := {((q, q_l), σ, γ, (q', q_l'), op) : (q, σ, γ, q', op) ∈ Δ and (q_l, γ, q_l') ∈ Δ_L}.

A straightforward induction shows that there is a run of S × A_L from ((q, i_0), s) to
((q', q_f), s') for q, q' ∈ Q and q_f ∈ F if and only if there is a run of S from (q, s) to (q', s')
such that the labels of the run form a word in L. Since Reach is a tree-automatic relation,
we obtain the proof of Proposition 3.9.
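
The product construction of Definition 5.12 pairs every transition of S with each automaton transition that reads the same label. As a minimal sketch, assuming transitions of S are given as 5-tuples (q, σ, γ, q', op) and transitions of the word automaton as triples (p, γ, p'), the transition relation of the product can be computed as follows. The function name `product_transitions` and the concrete state and operation names in the example are hypothetical and chosen only for illustration.

```python
def product_transitions(delta, delta_l):
    """Transition relation of the product of a pushdown system and a word automaton.

    delta:   set of tuples (q, sigma, gamma, q2, op) of the pushdown system
    delta_l: set of tuples (p, gamma, p2) of the finite word automaton
    A product transition exists whenever both components read the same label gamma."""
    return {
        ((q, p), sigma, gamma, (q2, p2), op)
        for (q, sigma, gamma, q2, op) in delta
        for (p, gamma_l, p2) in delta_l
        if gamma == gamma_l
    }

# Toy example: two system transitions and three automaton transitions.
delta = {
    ("q0", "bot", "a", "q1", "push_x"),
    ("q1", "x", "b", "q0", "pop1"),
}
delta_l = {
    ("i0", "a", "i1"),
    ("i1", "a", "i1"),
    ("i1", "b", "i0"),
}
delta_prod = product_transitions(delta, delta_l)
```

The stack operation `op` is copied unchanged, so the product behaves on the stack exactly like S while the second state component tracks the word read so far.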

Proof of Proposition 3.9. Recall Remark 4.3. It says that there is a positive existential
first-order formula defining Reach in terms of R⇐, R⇓, R⇑ and R⇒. Due to Lemmas 3.7, 5.7,
5.9, 5.10 and 5.11, there are polynomials p and p' such that there is a (nondeterministic)
tree-automaton A corresponding to Reach on S × A_L with p(|Σ|) · exp(p'(|Q| · |Q_L|)) many
states. We obtain that for states q, q' and stacks s, s' there is a final state q_f ∈ F of A_L
such that A accepts (Enc((q, i_0), s), Enc((q', q_f), s')) if and only if ((q, s), (q', s')) ∈ Reach_L
holds in S.

This almost completes the proof. We only have to modify A in such a way that it
guesses q_f and treats the configuration (q, s) as if it were ((q, i_0), s). This can easily be
done without increasing the number of states of the automaton because the states of the
configurations are encoded in the roots of the trees. We explain the detailed modification
in Appendix E. □

6. Conclusion 

We have shown that level 2 collapsible pushdown graphs are uniformly tree-automatic. 
Thus, their first-order theories are decidable with nonelementary complexity. Moreover,
even first-order logic extended by regular reachability is decidable because of the automaticity
of the regular reachability relations. Our result is sharp in several directions. First, we
have also shown a nonelementary lower bound for the complexity of the first-order model- 
checking problem on collapsible pushdown graphs. Furthermore, Broadbent [5] showed 
that the first-order theories of collapsible pushdown graphs are undecidable from level 3 on 
(which implies that they are not tree-automatic). 

Appendix A. Proof of Bijectivity of Enc

We start by explicitly constructing the inverse of Enc. This inverse is called Dec. Since
Enc removes the collapse links of the elements in a stack, we have to restore these now. In
order to restore the collapse links we use the following auxiliary functions for each g ∈ N

f_g : {ε} ∪ (Σ × {1, 2}) → {ε} ∪ Σ ∪ (Σ × {2} × N)

which map labels of trees to 1-words of length up to 1. We set

f_g(τ) := σ           if τ = (σ, 1),
          (σ, 2, g)   if τ = (σ, 2),
          ε           if τ = ε.

In the next definition g is the width of the stack decoded so far.

Definition A.1. Let Γ := (Σ × {1, 2}) ∪ {ε}. Recall that encodings of stacks are trees in
𝕋_Γ. We define the function Dec : 𝕋_Γ × N → (Σ ∪ (Σ × {2} × N))^{*2} as follows. Let

Dec(T, g) := f_g(T(ε))                                          if dom(T) = {ε},
             f_g(T(ε)) \ Dec(T_0, g)                            if 1 ∉ dom(T), 0 ∈ dom(T),
             f_g(T(ε)) \ (ε : Dec(T_1, g + 1))                  if 0 ∉ dom(T), 1 ∈ dom(T),
             f_g(T(ε)) \ (Dec(T_0, g) : Dec(T_1, g + G(T_0)))   otherwise,

where G(T_0) := |Dec(T_0, 0)| is the width of the stack encoded in T_0. For a tree T ∈ 𝕋_Enc,
the decoding of T is

Dec(T) := (T(ε), Dec(T_0, 0)) ∈ Q × (Σ ∪ (Σ × {2} × N))^{+2}.
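
The recursive definition of Dec(T, g) translates almost literally into code. The following Python sketch is an illustration under stated assumptions, not the paper's formal object: a tree is a nested triple `(label, left, right)` with `None` for an absent subtree, the label ε is represented by `None`, a level 1 letter is a plain symbol, and a level 2 letter is a triple `(symbol, 2, g)`. The names `f` and `dec` are hypothetical. The operation w \ s is realised by prepending the word w to every word of the 2-word s.

```python
def f(g, label):
    """f_g: map a tree label to a word of length at most 1."""
    if label is None:                      # label epsilon
        return ()
    sym, lvl = label
    return (sym,) if lvl == 1 else ((sym, 2, g),)

def dec(tree, g):
    """Dec(T, g): decode a tree into a 2-word (list of words), g being the width
    of the stack decoded so far."""
    label, left, right = tree
    w = f(g, label)
    if left is None and right is None:
        rest = [()]                        # dom(T) = {epsilon}
    elif right is None:
        rest = dec(left, g)                # only the 0-successor exists
    elif left is None:
        rest = [()] + dec(right, g + 1)    # only the 1-successor exists
    else:
        decoded_left = dec(left, g)
        # G(T_0) = |Dec(T_0, 0)|; the width does not depend on g, so len() suffices.
        rest = decoded_left + dec(right, g + len(decoded_left))
    return [w + u for u in rest]           # w \ rest: prefix every word with w

def leaf(label):
    return (label, None, None)

# Tree with root (bot, 1), 0-successor (a, 1) and an epsilon-labelled 1-successor
# whose 0-successor is labelled (b, 2).
tree = (("bot", 1), leaf(("a", 1)), (None, leaf(("b", 2)), None))
stack = dec(tree, 0)
```

In the example the letter decoded from the (b, 2)-labelled node receives the link value 1, because one word has already been decoded when the right subtree is entered.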

Remark A.2. Obviously, for each T ∈ 𝕋_Enc, Dec(T) ∈ Q × (Σ ∪ (Σ × {2} × N))^{+2}. In fact,
the image of Dec is contained in Cnf, i.e., Dec(T) = (q, s) such that s is a level 2 stack.
The verification of this claim relies on two important observations.

Firstly, T(0) = (⊥, 1) due to condition 2 of Definition 3.5. Thus, all words in s start
with letter ⊥. The 2-word s is a stack if and only if the link structure of s can be created
using the push, clone and pop_1 operations. The proof of this claim can be done by a tedious but
straightforward induction. We only sketch the most important observations for this fact.

Every letter a of the form (σ, 2, g) occurring in s is either a clone or can be created by
the push_{σ,2} operation. We call a a clone if a occurs in s in some word waw' such that the
word to the left of this word has wa as prefix. Note that cloned elements are those that can
be created by use of the clone_2 and pop_1 operations from a certain substack of s.
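
The notion of a clone used above is a purely local condition on two neighbouring words and can be checked mechanically. The following Python sketch assumes the list-of-words representation of 2-words, with level 2 letters written as triples `(symbol, 2, g)`; the function name `is_clone` and the index convention (word index `i`, letter position `j`, both starting at 0) are hypothetical choices for illustration.

```python
def is_clone(stack, i, j):
    """Check whether the letter at position j of word i is a clone, i.e. whether
    the word to the left of word i has the prefix w a, where w a = stack[i][:j + 1]."""
    if i == 0:
        return False                       # the first word has no word to its left
    prefix = stack[i][:j + 1]
    return stack[i - 1][:j + 1] == prefix

# The letter (a, 2, 0) in the second word is a clone; the letter b is not.
example = [("bot", ("a", 2, 0)), ("bot", ("a", 2, 0), "b")]
```

Because letters are compared together with their link values, two occurrences of the same symbol with different links are not regarded as clones of each other in this sketch.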

If a is not a clone in this sense, then Dec creates the letter a because there is some
(σ, 2)-labelled node in T corresponding to a. Now, the important observation is that Dec
defines a = f_g((σ, 2)) where g + 1 is the width of the stack decoded from the lexicographically
smaller nodes. Hence, the letter a occurs in the (g + 1)-st word of s and points to the g-th
word. Such a letter a can clearly be created by a push_{σ,2} operation. Thus, all 2-words in the
image of Dec can be generated by stack operations from the initial stack. A reformulation
of this observation is that the image of Dec only contains configurations.

Now, we prove that Dec is injective on 𝕋_Enc. Afterwards, we show that Dec ∘ Enc is
the identity on the set of all configurations. This implies that Dec is a surjective map from
𝕋_Enc to Cnf. Putting both facts together, we obtain that Dec is the inverse of Enc whence,
of course, Enc is bijective.

Lemma A.3. Dec is injective on 𝕋_Enc.

Proof. Assume that there are trees T', U' ∈ 𝕋_Enc with Dec(T') = Dec(U') = (q, s). Then
by definition T'(ε) = U'(ε) = q. Thus, we only have to compare the subtrees rooted at 0,
i.e., T := T'_0 and U := U'_0. From our assumption it follows that Dec(T, 0) = Dec(U, 0).

Note that the roots of T and of U are both labelled by (⊥, 1). The lemma follows from
the following claim.

Claim. Let T and U be trees such that there are T', U' ∈ 𝕋_Enc and d ∈ dom(T') \ {ε},
e ∈ dom(U') \ {ε} such that T = T'_d and U = U'_e. If Dec(T, m) = Dec(U, m) and either
T(ε) = U(ε) = ε or T(ε) ∈ Σ × {1, 2} and U(ε) ∈ Σ × {1, 2}, then U = T.

The proof is by induction on the depth of the trees U and T. If dpt(U) = dpt(T) = 0,
Dec(U, m) and Dec(T, m) are uniquely determined by the labels of their roots. A
straightforward consequence of the definition of Dec is that U(ε) = T(ε) whence U = T.

Now, assume that the claim is true for all trees of depth at most k for some fixed k ∈ N.
Let U and T be trees of depth at most k + 1.

We proceed by a case distinction on whether the left or right subtree of T and U are
defined. In fact, we will later prove that Dec(T, m) = Dec(U, m) implies that

(1) T_0 ≠ ∅ if and only if U_0 ≠ ∅ and

(2) T_1 ≠ ∅ if and only if U_1 ≠ ∅.

We first prove that Dec(T, m) = Dec(U, m) and conditions (1) and (2) imply that U = T.
Afterwards we show that all possible combinations that do not satisfy these conditions
imply Dec(T, m) ≠ Dec(U, m).

(1) Assume that U_0 = U_1 = T_0 = T_1 = ∅. Then dpt(T) = dpt(U) = 0. For trees of depth 0
we have already shown that Dec(U, m) = Dec(T, m) implies U = T.

(2) Assume that U_0 = ∅, U_1 ≠ ∅, T_0 = ∅ and T_1 ≠ ∅. In this case

Dec(U, m) = f_m(U(ε)) \ (ε : Dec(U_1, m + 1)) and

Dec(T, m) = f_m(T(ε)) \ (ε : Dec(T_1, m + 1)).

Since U(ε) = ε if and only if T(ε) = ε, we can directly conclude that U(ε) = T(ε).
But then Dec(T, m) = Dec(U, m) implies that Dec(T_1, m + 1) = Dec(U_1, m + 1). Since
dpt(T_1) ≤ k and dpt(U_1) ≤ k, the induction hypothesis implies that T_1 = U_1. We
conclude that T = U.

(3) Assume that U_0 ≠ ∅, U_1 = ∅, T_0 ≠ ∅, and T_1 = ∅. In this case,

Dec(U, m) = f_m(U(ε)) \ Dec(U_0, m) and
Dec(T, m) = f_m(T(ε)) \ Dec(T_0, m).

Since U(ε) = ε if and only if T(ε) = ε, we conclude that U(ε) = T(ε) and Dec(U_0, m) =
Dec(T_0, m). Since the depths of U_0 and of T_0 are at most k, the induction hypothesis
implies U_0 = T_0 whence U = T.

(4) Assume that U_0 ≠ ∅, U_1 ≠ ∅, T_0 ≠ ∅, and T_1 ≠ ∅. Then we have

Dec(U, m) = f_m(U(ε)) \ (Dec(U_0, m) : Dec(U_1, m + m')) and
Dec(T, m) = f_m(T(ε)) \ (Dec(T_0, m) : Dec(T_1, m + m''))
for some natural numbers m', m'' > 0.

Footnote: Since a node d of a tree in 𝕋_Enc is labelled by ε iff d ∈ {0, 1}*1, the pairs of
subtrees T_i and U_i inherit this condition for all i ∈ {0, 1}.
Since U(ε) = ε if and only if T(ε) = ε this implies that the roots of U and T coincide.
Hence,

Dec(U_0, m) : Dec(U_1, m + m') = Dec(T_0, m) : Dec(T_1, m + m'').

If Dec(U_0, m) = Dec(T_0, m), then the induction hypothesis yields U_0 = T_0. Furthermore,
this implies Dec(U_1, m + m') = Dec(T_1, m + m'') and m' = m'' whence by induction
hypothesis U_1 = T_1. In this case we conclude immediately that T = U.

The other case is that Dec(U_0, m) ≠ Dec(T_0, m). We conclude immediately that the
width of Dec(U_0, m) and the width of Dec(T_0, m) do not coincide. We prove that this
case contradicts the assumption that Dec(U, m) = Dec(T, m).

Let us assume that Dec(U_0, m) = pop_2^z(Dec(T_0, m)) for some z ∈ N \ {0}. Note that
this implies that the first word of Dec(U_1, m + m') is a word in Dec(T_0, m).

Since U(0) is a left successor in some tree belonging to 𝕋_Enc, it is labelled by some
(σ, l) ∈ Σ × {1, 2}. We make a case distinction on l.

(a) Assume that U(0) = (σ, 2) for some σ ∈ Σ. Then all words in Dec(T_0, m) start with
the letter (σ, 2, m). Thus, the first word of Dec(U_1, m + m') must also start with
(σ, 2, m). But all collapse links of level 2 in Dec(U_1, m + m') are at least m + m' > m.
This is a contradiction.

(b) Otherwise, U(0) = (σ, 1) for some σ ∈ Σ. Thus, all words in Dec(T_0, m) start with
the letter σ. Thus, the first word of Dec(U_0, m) and the first word of Dec(U_1, m + m')
have to start with σ. But this implies U(0) = U(10) = (σ, 1). This contradicts
the assumption that U is a proper subtree of a tree from 𝕋_Enc (cf. condition 6 of
Definition 3.5).

Both cases result in contradictions. Thus, there is no z ∈ N \ {0} such that

Dec(U_0, m) = pop_2^z(Dec(T_0, m)).

By symmetry, we obtain that there is no z ∈ N \ {0} such that

Dec(T_0, m) = pop_2^z(Dec(U_0, m)).

Thus, we conclude that Dec(T_0, m) = Dec(U_0, m) whence U = T as shown above.

If Dec(T, m) = Dec(U, m), one of the previous cases applies because the following case
distinction shows that all other cases for the defined or undefined subtrees of T and U
imply Dec(T, m) ≠ Dec(U, m).

(1) Assume that U_0 = U_1 = T_0 = ∅ and T_1 ≠ ∅. In this case, Dec(U, m) is [ε] or [τ] for
some τ ∈ Σ ∪ (Σ × {2} × N). Furthermore,

Dec(T, m) = f_m(T(ε)) \ (ε : Dec(T_1, m + 1)).

It follows that |Dec(T, m)| ≥ 2 > |Dec(U, m)| = 1 whence Dec(T, m) ≠ Dec(U, m).

(2) Assume that U_0 = U_1 = ∅, T_0 ≠ ∅, and T_1 = ∅. In this case, Dec(U, m) is again [ε] or
[τ] for some τ ∈ Σ ∪ (Σ × {2} × N). Since U(ε) = ε if and only if T(ε) = ε, we conclude
that |f_m(T(ε))| = |Dec(U, m)|. Moreover,

Dec(T, m) = f_m(T(ε)) \ f_m(T(0)) \ s

for some 2-word s. Since T is a subtree of a tree in 𝕋_Enc, T(0) ∈ Σ × {1, 2}.
Thus, f_m(T(0)) ∈ Σ ∪ (Σ × {2} × N). We conclude that the length of the first
word of Dec(T, m) is greater than the length of the first word of Dec(U, m). Thus,
Dec(T, m) ≠ Dec(U, m).

(3) Assume that U_0 = U_1 = ∅, T_0 ≠ ∅, and T_1 ≠ ∅. Completely analogous to case (1) we
conclude that |Dec(T, m)| ≥ 2 > |Dec(U, m)| = 1 whence Dec(T, m) ≠ Dec(U, m).

(4) Assume that U_0 = ∅, U_1 ≠ ∅, and T_0 = T_1 = ∅. Exchanging the roles of U and T, this
is exactly the same as case (1).

(5) Assume that U_0 = ∅, U_1 ≠ ∅, T_0 ≠ ∅, and T_1 = ∅. Analogously to case (2) we derive
that the length of the first word of Dec(T, m) is greater than the length of the first
word of Dec(U, m). Thus, Dec(T, m) ≠ Dec(U, m).

(6) Assume that U_0 = ∅, U_1 ≠ ∅, T_0 ≠ ∅, and T_1 ≠ ∅. Analogously to case (5) we derive
that the length of the first word of Dec(T, m) is greater than the length of the first
word of Dec(U, m). Thus, Dec(T, m) ≠ Dec(U, m).

(7) Assume that U_0 ≠ ∅, and U_1 = T_0 = T_1 = ∅. Exchanging the roles of U and T, this is
exactly the case (2).

(8) Assume that U_0 ≠ ∅, U_1 = T_0 = ∅, and T_1 ≠ ∅. Exchanging the roles of U and T, this
is exactly the case (5).

(9) Assume that U_0 ≠ ∅, U_1 = ∅, T_0 ≠ ∅, and T_1 ≠ ∅. In this case,

Dec(U, m) = f_m(U(ε)) \ Dec(U_0, m)

and Dec(T, m) = f_m(T(ε)) \ (Dec(T_0, m) : Dec(T_1, m + m'))

for some m' ∈ N \ {0}. Since U(ε) = ε if and only if T(ε) = ε, we conclude that
U(ε) = T(ε). Now,

Dec(U_0, m) = τ \ u'

for τ = f_m(U(0)) ∈ Σ ∪ (Σ × {2} × {m}) and some level 2-word u'. We distinguish the
following cases.

First assume that τ = (σ, 2, m). For all letters in T' := Dec(T_1, m + m') of collapse
level 2, the collapse link is greater or equal to m + m'. Hence, T' does not contain a
symbol (σ, 2, m) whence Dec(U, m) ≠ Dec(T, m).

Otherwise, τ ∈ Σ. But then Dec(U, m) = Dec(T, m) would imply that

Dec(T_0, m) = τ \ T'

and Dec(T_1, m + m') = τ \ T''

for certain nonempty level 2-words T' and T''. Since T(1) = ε, it follows that
T(0) = T(10) = (τ, 1) which contradicts the fact that T is a subtree of some tree
from 𝕋_Enc.

Thus, we conclude that Dec(T, m) ≠ Dec(U, m).

(10) Assume that U_0 ≠ ∅, U_1 ≠ ∅, and T_0 = T_1 = ∅. Exchanging the roles of U and T, this
is the same as case (3).

(11) Assume that U_0 ≠ ∅, U_1 ≠ ∅, T_0 = ∅, and T_1 ≠ ∅. Exchanging the roles of U and T,
this is the same as case (6).

(12) Assume that U_0 ≠ ∅, U_1 ≠ ∅, T_0 ≠ ∅, and T_1 = ∅. Exchanging the roles of U and T,
this is the same as case (9).

Hence, we have seen that Dec(T, m) = Dec(U, m) implies that each of the subtrees of T is
defined if and only if the corresponding subtree of U is defined. Under this condition, we
concluded that U = T. Thus, the claim holds and the lemma follows as indicated above. □
Next, we prove that $\mathrm{Dec}$ is a surjective map from $\mathbb{T}_{\mathrm{Enc}}$ to $\mathrm{Cnf}$. This is done by induction
on the size of blocklines used to encode a stack. In this proof we use the notion of left
maximal blocks and good blocklines. Let

$s : (w \backslash (w' : b)) : s'$

be a stack where $s$ and $s'$ are 2-words, $w$ and $w'$ are words, and $b$ is a $\tau$-block. We call $b$
left maximal in this stack if either $b = [\tau]$ or $b = \tau\tau' \backslash b'$ such that $w'$ does not start with
$\tau\tau'$ for some $\tau' \in \Sigma \cup (\Sigma \times \{2\} \times \mathbb{N})$. We call a blockline in some stack good if its first block
is left maximal. Furthermore, we call the blockline starting with the block $b$ left maximal if
$w'$ does not start with $\tau$. Recall that the encoding of stacks works on left maximal blocks
and good blocklines.

Lemma A.4. $\mathrm{Dec} \circ \mathrm{Enc}$ is the identity, i.e., $\mathrm{Dec}(\mathrm{Enc}(c)) = c$, for all $c \in \mathrm{Cnf}$.

Corollary A.5. $\mathrm{Dec} : \mathbb{T}_{\mathrm{Enc}} \to \mathrm{Cnf}$ is surjective.
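
Corollary A.5 follows from Lemma A.4 in one line. Spelled out as a sketch, assuming (as the notation suggests) that $\mathbb{T}_{\mathrm{Enc}}$ contains $\mathrm{Enc}(c)$ for every configuration $c$; the name $t_c$ is introduced here only for illustration:

```latex
\[
  \forall c \in \mathrm{Cnf}:\qquad
  t_c := \mathrm{Enc}(c) \in \mathbb{T}_{\mathrm{Enc}}
  \quad\text{and}\quad
  \mathrm{Dec}(t_c) = \mathrm{Dec}(\mathrm{Enc}(c)) = c .
\]
```

Hence every configuration has a preimage under $\mathrm{Dec}$.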



Proof of Lemma A.4. Let $c = (q, s)$ be a configuration. Since $\mathrm{Dec}$ and $\mathrm{Enc}$ encode and
decode the state of $c$ in the root of $\mathrm{Enc}(c)$, it suffices to show that

$\mathrm{Dec}(\mathrm{Enc}(s, (\bot, 1)), 0) = s$

for all stacks $s \in \mathrm{Stck}(\Sigma)$. We proceed by induction on blocklines of the stack $s$. For this
purpose we reformulate the lemma in the following claim.

Claim. Let $s'$ be some stack which decomposes as $s' = s'' : (w \backslash b) : s'''$ such that
$b \in (\Sigma \cup (\Sigma \times \{2\} \times \mathbb{N}))^{+2}$ is a good $\tau$-blockline for some $\tau \in \Sigma \cup (\Sigma \times \{2\} \times \mathbb{N})$. Then

(1) $\mathrm{Dec}(\mathrm{Enc}(b, \varepsilon), |s''|) = b'$ for the unique 2-word $b'$ such that $b = \tau \backslash b'$ and

(2) if $b$ is left maximal, then $\mathrm{Dec}(\mathrm{Enc}(b, (\sigma, l)), |s''|) = b$ where $\sigma = \mathrm{Sym}(\tau)$ and $l = \mathrm{CLvl}(\tau)$.
Note that the conditions in the second part require that either $\tau \in \Sigma$ or $\tau = (\sigma, 2, |s''|)$ for
some $\sigma \in \Sigma$.

The lemma follows from the second part of the claim because every stack is a left
maximal $\bot$-blockline.

We prove both claims by parallel induction on the size of $b$. As an abbreviation we
set $g := |s''|$. We write $\overset{(1)}{=}$ ($\overset{(2)}{=}$, respectively) when some equality is due to the induction
hypothesis of the first claim (the second claim, respectively). The arguments for the first
claim are as follows.

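• If $b = [\tau]$ for $\tau \in \Sigma \cup (\Sigma \times \{2\} \times \mathbb{N})$, the claim is true because

$\mathrm{Dec}(\mathrm{Enc}(b, \varepsilon), g) = \mathrm{Dec}(\varepsilon, g) = \varepsilon.$

• If there are $b_1, b_1' \in (\Sigma \cup (\Sigma \times \{2\} \times \mathbb{N}))^{*2}$ such that
$b = [\tau] : b_1 = [\tau] : (\tau \backslash b_1')$, then

$\mathrm{Dec}(\mathrm{Enc}(b, \varepsilon), g) = \mathrm{Dec}(\varepsilon\langle \emptyset, \mathrm{Enc}(b_1, \varepsilon) \rangle, g)$
$= f_g(\varepsilon) \backslash (\varepsilon : \mathrm{Dec}(\mathrm{Enc}(b_1, \varepsilon), g + 1))$
$\overset{(1)}{=} \varepsilon \backslash (\varepsilon : b_1') = \varepsilon : b_1' = b'.$

• Assume that there is some $\tau' \in \Sigma \cup (\Sigma \times \{2\} \times \mathbb{N})$ and some $b_1 \in (\Sigma \cup (\Sigma \times \{2\} \times \mathbb{N}))^{*2}$
such that

$b = \tau\tau' \backslash b_1.$

The assumption that $b$ is good implies that the blockline $\tau' \backslash b_1$ is left maximal whence

$\mathrm{Dec}(\mathrm{Enc}(b, \varepsilon), g) = \mathrm{Dec}(\varepsilon\langle \mathrm{Enc}(\tau' \backslash b_1, (\mathrm{Sym}(\tau'), \mathrm{CLvl}(\tau'))), \emptyset \rangle, g)$
$= f_g(\varepsilon) \backslash \mathrm{Dec}(\mathrm{Enc}(\tau' \backslash b_1, (\mathrm{Sym}(\tau'), \mathrm{CLvl}(\tau'))), g)$
$\overset{(2)}{=} \tau' \backslash b_1 = b'.$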

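• The last case is that

$b = \tau \backslash ((\tau' \backslash b_1) : b_2)$

for $b_2$ a blockline of $s$ not starting with $\tau'$. By this we mean that $b_2 \neq \tau' w' : b_2'$ for
any word $w'$ and any 2-word $b_2'$. Since $b$ is good, $\tau' \backslash b_1$ is a left maximal blockline.
Furthermore, $\tau \backslash b_2$ is a good blockline. Thus,

$\mathrm{Dec}(\mathrm{Enc}(b, \varepsilon), g)$
$= \mathrm{Dec}(\varepsilon\langle \mathrm{Enc}(\tau' \backslash b_1, (\mathrm{Sym}(\tau'), \mathrm{CLvl}(\tau'))), \mathrm{Enc}(\tau \backslash b_2, \varepsilon) \rangle, g)$
$= f_g(\varepsilon) \backslash (\mathrm{Dec}(\mathrm{Enc}(\tau' \backslash b_1, (\mathrm{Sym}(\tau'), \mathrm{CLvl}(\tau'))), g) : \mathrm{Dec}(\mathrm{Enc}(\tau \backslash b_2, \varepsilon), g + l)),$

where

$l = |\mathrm{Dec}(\mathrm{Enc}(\tau' \backslash b_1, (\mathrm{Sym}(\tau'), \mathrm{CLvl}(\tau'))), g)| = |b_1|.$

From this, we obtain by the induction hypotheses of both claims that

$\mathrm{Dec}(\mathrm{Enc}(b, \varepsilon), g) = \varepsilon \backslash ((\tau' \backslash b_1) : \mathrm{Dec}(\mathrm{Enc}(\tau \backslash b_2, \varepsilon), g + l))$
$= (\tau' \backslash b_1) : b_2 = b'.$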

For the proof of the second claim, note that the calculations are basically the same, but
$f_g(\varepsilon)$ is replaced by $f_g(\sigma, l)$. Thus, if $l = 1$ then $f_g(\sigma, l) = \sigma = \tau$. For the case $l = 2$,
recall that $g = |s''|$ whence $f_g(\sigma, l) = (\sigma, 2, |s''|)$. Note that $\mathrm{CLnk}(\tau) = |s''|$ due to the left
maximality of $b$.

Thus, one proves the second case using the same calculations, but replacing $\varepsilon$ by $\tau$. □

From the previous lemmas, we directly obtain Lemma 3.8, i.e., we obtain that $\mathrm{Enc}$ is
bijective.



Appendix B. Automaton for Relation $R^{\Leftarrow}$

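Given a CPS $S$, there is an automaton $\mathcal{A}_{R^{\Leftarrow}}$ that accepts the tree $\mathrm{Enc}(c_1) \otimes \mathrm{Enc}(c_2)$ for
arbitrary configurations $c_1$ and $c_2$ if and only if $c_2 = \mathrm{pop}_2^k(c_1)$ and there is a run $\rho$ from $c_1$
to $c_2$ such that $\rho(j)$ is not a substack of $c_2$ for all $j < \mathrm{length}(\rho)$, i.e., if and only if $(c_1, c_2) \in R^{\Leftarrow}$. The states
of $\mathcal{A}_{R^{\Leftarrow}}$ come from the set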

{!_, qi , q % , q=} U M with M := {S, R, P, C} x Q x Q x S x {1, 2} x 2 QxQ . 

Before giving a definition of $\mathcal{A}_{R^{\Leftarrow}}$, we informally describe how $\mathcal{A}_{R^{\Leftarrow}}$ processes some tree
$T := \mathrm{Enc}(c_1) \otimes \mathrm{Enc}(c_2)$. An accepting run on $T$ labels $d \in \{0, 1\}^*$

C1. by $\bot$ if $d \in T^+$;

C2. by $q_I$ if $d = \varepsilon$; it is the initial state in which the automaton reads the states of $c_1$ and
$c_2$, before it processes the encodings of the stacks,

C3. by $q_=$ if $d \in \mathrm{Enc}(c_2)$ but not in the rightmost branch of $\mathrm{Enc}(c_2)$; this state is used
to check equality of the parts of $\mathrm{Enc}(c_1)$ and $\mathrm{Enc}(c_2)$ that are left of the rightmost
branch of $\mathrm{Enc}(c_2)$,

C4. by some element from $\{S\} \times Q \times Q \times \Sigma \times \{1, 2\} \times 2^{Q \times Q}$ if $d$ is in the rightmost branch
of $\mathrm{Enc}(c_2)$; $S$ stands for searching the node encoding the final configuration of the
run.

C5. by some element from the set {g } U ({R,P,C} x Q x Q x S x {1, 2} x 2 QxQ ) if 
d G Enc(ci) \ Enc(c 2 ). 
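
The regions distinguished in this list can be illustrated on toy trees. The following is a minimal sketch, assuming trees are stored as dictionaries from node addresses (strings over `0` and `1`, with the empty string as root) to labels, and that the rightmost branch is the set of prefixes of the lexicographically largest address. The names `convolution`, `rightmost_branch` and `region` are hypothetical helpers, and the labels are placeholders rather than actual encodings of configurations.

```python
PAD = "□"  # padding label for a node that is missing in one of the two trees

def convolution(t1, t2):
    """Pair up the labels of two trees given as {address: label} dicts."""
    return {d: (t1.get(d, PAD), t2.get(d, PAD)) for d in set(t1) | set(t2)}

def rightmost_branch(tree):
    """All prefixes of the lexicographically largest address of the tree."""
    leaf = max(tree)
    return {leaf[:i] for i in range(len(leaf) + 1)}

def region(d, t1, t2):
    """Classify a node of the convolution, roughly along the lines of C2 to C5."""
    if d == "":
        return "root"
    if d in t2:
        if d in rightmost_branch(t2):
            return "rightmost branch of t2"
        return "left part of t2"
    return "only in t1"

# Toy trees standing in for Enc(c1) and Enc(c2).
t1 = {"": "q1", "0": "a", "00": "b", "01": "c", "011": "d"}
t2 = {"": "q2", "0": "a", "00": "b", "01": "c"}
conv = convolution(t1, t2)
for d in sorted(conv):
    print(repr(d), conv[d], region(d, t1, t2))
```

Nodes that occur only in the first tree carry the padding symbol in their second component, which is what lets an automaton reading the convolution tell the regions apart.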

Those labels that come from M are used to check the existence of some run from c\ to C2 
as follows. For all d G Enc(ci), let d^ be the rightmost leaf of the subtree induced by d in 
Enc(ci). Set := LStck^^, Enc(ci)). Let q G M be the label of some node d. By iTi(q) 
we denote the projection of q to the i-th component. Depending on ni(q) we define a stack 
Sd as follows. 

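(1) If $\pi_1(q) = S$, set $s_d$ to be the stack of $c_2$.

(2) If $\pi_1(q) = R$, set $s_d := \mathrm{pop}_2(\mathrm{LStck}(d, \mathrm{Enc}(c_1)))$.

(3) If $\pi_1(q) = C$, set $s_d := \mathrm{collapse}(\mathrm{LStck}(d, \mathrm{Enc}(c_1)))$.

(4) If $\pi_1(q) = P$, set $s_d := \mathrm{pop}_1(\mathrm{LStck}(d, \mathrm{Enc}(c_1)))$.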
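
The stack operations named in this list can be sketched concretely. The code below is an illustration under assumptions, not the paper's formal definition: a level 2 stack is a list of words, a word is a list of letters, and a letter is a triple `(symbol, collapse level, collapse link)`. It assumes the common convention that a letter of collapse level 2 pushed onto a stack with n words receives link n - 1, and that a collapse of level 2 with link j keeps the lowest j words. All function names are illustrative.

```python
def top_letter(s):
    """Topmost letter of the topmost word."""
    return s[-1][-1]

def pop1(s):
    """Remove the topmost letter of the topmost word."""
    if len(s[-1]) < 2:
        raise ValueError("pop1 undefined: topmost word has a single letter")
    return s[:-1] + [s[-1][:-1]]

def pop2(s):
    """Remove the topmost word."""
    if len(s) < 2:
        raise ValueError("pop2 undefined: stack has a single word")
    return s[:-1]

def clone2(s):
    """Duplicate the topmost word; links inside the copy are kept."""
    return s + [list(s[-1])]

def push(s, sym, lvl):
    """Push a letter; a level 2 letter links to the words below the top."""
    link = len(s) - 1 if lvl == 2 else 0
    return s[:-1] + [s[-1] + [(sym, lvl, link)]]

def collapse(s):
    """Follow the collapse link of the topmost letter."""
    _sym, lvl, link = top_letter(s)
    if lvl == 1:
        return pop1(s)
    if link < 1:
        raise ValueError("collapse undefined: link points below the stack")
    return s[:link]

s0 = [[("⊥", 1, 0)]]
s4 = clone2(clone2(push(clone2(s0), "a", 2)))  # four words, top letter links to width 1
print(collapse(s4) == pop2(pop2(pop2(s4))))
```

In this toy run the collapse has the same effect as three consecutive pop2 operations, which matches the role of the flag C as a shortcut to a stack of the form $\mathrm{pop}_2^k$.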

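An accepting run $\rho$ of $\mathcal{A}_{R^{\Leftarrow}}$ will label some node $d$ by $q \in M$ such that
C6. $\pi_4(q) = \mathrm{Sym}(\mathrm{LStck}(d, \mathrm{Enc}(c_1)))$,
C7. $\pi_5(q) = \mathrm{CLvl}(\mathrm{LStck}(d, \mathrm{Enc}(c_1)))$, and
C8. $\pi_6(q) = \mathrm{Rt}(\mathrm{LStck}(d, \mathrm{Enc}(c_1)))$.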

C9. Moreover, if 7Ti(g) / P then there is a run p from (^(q), s£) to (^(g),^^) (which 
is an infix of some run witnessing (ci, C2) G i?^). The meaning of the labels R and 
C is as follows. If ni(q) = R then p ends in pop 2 (LStck(d, Enc(ci))). If n±(q) = C 
then p ends in collapse(LStck((i, Enc(ci))) and the collapse level is 2 (moreover, p 
actually performs as the last operation a collapse on a copy of the topmost element 
of LStck(d, ci)). Thus, in both cases the run will end in the stack Sd- To be more 
precise, the run ends in and does not visit any substack of Sd before its final 
configuration. 

CIO. If TTi(q) = P then there is some stack s' with Sd < s' and top 2 (s,i) = top 2 (s') such 
that there is a run from (^(q), s£) to (^3(9), s') (which is again an infix of some run 
witnessing (01,02) G R^). 

Cll. p will label d by q$ if there is a run from c\ to C2 not passing s£ for all d < e. 

Let us fix some notation. In this section, 7 ranges over T, y ranges over (S x {1, 2}) U {e} 
and w ranges over all words of the form w = top 2 (s)io- Whenever u; is fixed, we write 
a := Sym(w) and I = CLvl(u>). Furthermore, x ranges over {(a, l),s}. The variables 
Qii Q2, q[, q'2: q' range over Q. r ranges over S \ {±} and k over {1, 2}. We use the abbre- 
viation u (q, a, q', ColPop fe ) G A" for "37 such that 

(1) (9,0-,7,g',Popi) G A or 

(2) (g, <T,7, g', collapse) G A and k = 1". 
If uu, t and k are fixed, we write wr k for the word w6 where 9 = < ' ' 

It if k = 1. 

Definition B.l. Fix some CPS 5 = (Q, £, T, A, g ). Define := (Qa, ±, F, A A ) 

with Q A := {±, q I} g , g = } U M, £ A = ({e, □} U {£ x {1, 2}} U Q) 2 , F = {g/}. A A contains 
the following transitions. 

Tl. (g 7 ,(gi,g 2 ),(S,gi,g 2 ,i-,l,Rt(i_ 2 )),i_), 
T2. (g , (y, □), Y, Z) for y, Z G {±, g }, and 
T3. (g = , (y, y), y, Z) for Y,Z € {±, g=}, 
Fix some g := (5, gi, (72, <r, Z, Rt(u/)). We add the following transitions to A^: 
T4. (q,(x,x),±,±) if Qi_= g 2 ; 

T5. (g, (x, x), _L, qi) for g~i = (R, gi, g 2 , <J, I, Rt(w)) ; 

T6. (q,(x,x),X,q) for X e{±,q=}; 

T7. (q, (x,x),q ,±) for q = (S,q 1 ,q 2 ,T,k,Rt(wT k )); 

T8. (g, (x,x),q ,qi) for g~i = (R, q u q' 2 , a, I, Rt(w)) and g = (S,q' 2 ,q 2 ,T,k,Rt(wT k )). 
Fix some g := (J2, q±,q 2 ,a,l, Rt(w)). We add the following transitions to A^: 

T9. (g,(x,D),i_,±) if ( gi ,q 2 ) € Rt(ttf); 
T10. (g, (x, □), _L, gi) for gi = (i?, gi, g 2 , <r, Z, Rt(u/)) such that (g 2 , g2) G Rt(w); 
Til. (g, (x,D),g ,-L) for g = (i?, gi, g 2 , r, i, Rt(iwrj)) for i G {1,2}; 
T12. (g,(x,D),g ,gi) for 

gi = (R,q 1 ,q' 2 ,a,l,Rt(w)), 

go = (R,q' 2 ,q 2 ,T,i,Rt(wTi)) and 

i G {1,2}; 

T13. (g, (x,D),g ,_L) for g = (C, gi, g 2 , r, 2, Rt(wT 2 )); 

T14. (g, (x,D),g ,gi) for gi = (R, q u q' 2 , a, I, Rt(w)) and g = (C, g 2 , g 2 , r, 2, Rt(wr 2 )). 
Fix some g := (P, gi, g 2 , a, I, Rt(w)). We add the following transitions to A^: 
T15. (g, (x, □), _!_, _L) if there is a g with (gi, g) G lLp(w) and (g, a, g 2 , ColPop;) G A; 
T16. (g,(x,D),X,g) for X G {±,g }; 
T17. (g,(x,D),g ,±) for 

<?o = {P,qi,q' 2 ,T,k,Rt(wT k )), 

(g 2 ,g) G lLp(ui) and 
(g,o-,g 2 ,ColPo P/ ) G A; 

T18. (g, (£,□), g , gi) for 

gi = (i?,gi,gi,cj,Z,Rt(w;)), 

% = {P,q'i,q 2 ,T,k,Rt(wT k )), 

(q' 2 ,q) G lLp(u/) and 
(g,cr,g 2 ,ColPop / ) G A. 

Fix some g := ( C, gi, g 2 , cr, I, Rt(w)) with / = 2 (whence x ranges here over {(a, 2),e}). We 
add the following transitions to A^: 

T19. (g, (x, □), _!_, _L) if there is a g with (gi, g) G lLp(w) and (g, a, 7, g 2 , collapse) G A; 
T20. (q,(x,D),X,q) for X G {±,g }; 
qi = (R,qi,q' 2 ,o;2,Rt(w)), 
(q' 2 ,q) G lLp(w) and 
(q,a, 7,g 2 , collapse) G A; 

9o = (P,gi,^,r,A;,Rt(w;7>.)), 

(<72><?) G ILpM and 

7, 52, collapse) G A; 

qi = {R,q 1 ,q[,a,2,Rt(w)), 

(q^i) G lLp(io) and 
(9,0", 7, 92, collapse) G A. 

Lemma B.2. If $\mathcal{A}_{R^{\Leftarrow}}$ accepts a tree $\mathrm{Enc}(c_1) \otimes \mathrm{Enc}(c_2)$ for configurations $c_1, c_2$, then
$c_2 = \mathrm{pop}_2^k(c_1)$ and there is some run from $c_1$ to $c_2$ that does not reach a substack of $c_2$
before the final configuration, i.e., $(c_1, c_2) \in R^{\Leftarrow}$.

Proof. Assume that pr^ is an accepting run of Ar*= on T := Enc(ci) <S> Enc(c2). A straight- 
forward induction from the root to the leaves shows that c\ = pop 2 fc (c2), that Conditions CCD 
- Q5] hold and that the rightmost leaf of T is labelled by some element of M. Furthermore, 
PR<={0) G M and (ni(pR<= (0)), ir 2 (p R <= (0)), vr 3 (p fi ^ (0))) = (S, qi,q 2 ) for q, = (qi,Si). More- 
over, note that pR^(d) G M and Tti(pR<=(d)) = C implies CLvl(LStck(d, ci)) = 2: if a tran- 
sition at some node d labels the 0-successor dO by C, then we always have ~k^{pr^ {d0)) = 2. 
By construction, the transition of pr<= applied at dO enforces that Tvs(pR<=(dO)) is the link 
level encoded in the tree at dO. Thus, the claim holds for all successors. Moreover, a 1- 
successor is labelled by C only if its predecessor is also labelled by C. Thus, by induction on 
the distance to the first ancestor which is a 0-successor the claim holds also for 1-successors. 

By induction from the leaves to the root, we show that Conditions Cj9]and Q10I hold. 
This completes the proof, because pr<= (0) then witnesses that there is a run from c\ to c 2 
not passing a substack of c 2 before its final configuration. For the base case, assume that 
d G T is a leaf labelled by some q G M. Depending on iri{q) we have the following cases. 

• If TT\(q) = S, then d is the rightmost leaf of Enc(c2). Thus, s£ = Sd = s 2 . Since p is an 
accepting run, it uses a transition of the form T[H Thus, 7^(9) = ^3(9) and Condition CGJ] 
is trivially satisfied. 

• If %\{q) = R, p applies a transition of the form T|9l Recall that d is a leaf whence 
Sd = pop 2 (LStck(cf, ci)), s£ = LStck(<i, c\), and Tre(q) = Rt(LStck(cf, c\j). Thus, the con- 
dition in T[9]ensures that there is a run from (-7T 2 (q), s£) to (^3(5), Sd), i.e., Condition Cj9] 
holds. 

• If tti (q) = C, then p applies a transition of the form T [T9l Since the collapse level of 
is 2, we conclude analogously to the previous case that Condition Cj9] holds. 

• If 7Ti(<j) = P, the transition of pr<= at d is of the form T I151 Since s^ = LStck(c£, c\) the 
conditions of Til 51 ensure there exists some stack s' with < s' and top 2 (s') = top 2 (s^) 



T21. (g,(x,D),-L,?i) for 



T22. (q,(x,D),q ,±) for 



T23. (g,(x,D),g ,9i) for 
such that there is a 1-loop from to s' followed by a $\mathrm{pop}_1$ operation or a collapse of 
level 1. Note that Sd < pop 1 (s') and top 2 (sd) = top 2 (pop 1 (s / )). Thus, Condition QTU1 is 
satisfied. 

A tedious but easy case distinction shows that Conditions Cj9] and C fTOl carry over to all 
nodes of T. Instead of giving the full case distinction, we mention briefly the underlying 
ideas. 

(1) If dO G Enc(ci) \ Enc(c 2 ), dl G Enc(ci) \ Enc(c 2 ) and p R ^(dO) G M, then also 
PR^(dl) G M and m (p_R<=(<il)) = R whence s<ji = s^. Thus, we can compose the 
run associated to dl with the run associated to dO and obtain a run associated to d. 

(2) If i G {0, 1} minimal such that di G Enc(ci), then either Sdi = Sd such that the final 
part of the run associated to Sdi can serve as final part of the run associated to Sd or 
Sdi = LStck(ci, c\) and the conditions on the transition at d ensure that this run can 
be extended to a run to Sd if 7Ti (pn<=(d)) ^ P. If %% (p R ^(d)) = P, this run can be 
extended to some stack s' with Sd < s' and top 2 (sd) = top 2 (s'). 

(3) If i G {0, 1} is maximal such that di G Enc(ci) then s^ = s£ and the run associated 
to di may serve as initial part of the run associated to d. □ 

Lemma B.3. Let c\ = (g,si),c 2 = ((/,s 2 ) be configurations such that s 2 = pop 2 fc (si) and 
there is a run from c\ to c 2 that passes a substack of s 2 only in its final configuration. Then 
there is an accepting run of Ar<= on Enc(ci) <8> Enc(c 2 ). 

Proof. Let p be some run from c\ to c 2 . Recall the decomposition p = pi o p 2 ° • • • ° Pn 
provided by Lemma [4.171 Let po := pt[o,o] an< ^ ^ 1i denote the final state of pi for all 
< i < n. In the following, we will use the notation d := LStck(d, c\) for all d G {0, 1}*. 
Let d be the rightmost leaf of Enc(ci) <S> Enc(c 2 ). Note that 

d = si = p(0) = po(0) = / o (length(p )) 

whence po ends in (go, d). We define an accepting run pr<= of Ar<= on T := Enc(ci) (g) Enc(c 2 ) 
by induction as follows. Let d G T be the lexicographically maximal node of Enc(ci) such 
that pn<= has not been defined yet at d. Assume that there is some maximal i > 1 such 
that pi(0) = (qi-i,s) for some stack s satisfying pop 2 (d) < pop 2 (s) and top 2 (d) = top 2 (s). 
Furthermore, assume that s = d if p%-\ is not of the form Fj3l Depending on the form of pi, 
we proceed as follows. 

(1) If pi is of the Form FtH let d' be the minimal element such that d = d'0 m for some 
m G N. We set pn^(d) := (R, Sym(j), CLvl(d), Rt(j)) and for d' < e < d we set 
PR^(e) ■= (R, g e ,%,Sym(e),CLvl(e),Rt(e)) where q e = ir 2 (p R <=(ej)) for j = max{i G 
{0,1} : di G Enc(ci)}. 

(2) If pi is of the Form F21 let e' be minimal such that d = e'O mo l mi for some mo, mi G N. 
For each e satisfying e'0 m ° < e < d and for all eO < / G Enc(ci) we set pn<=(e) := 
(C,?i_i,gi,Sym(e),CLvl(e),Rt(e)) and pn<=(f) := q%. 

For all e with e' < e < e'0 m ° we define pn^(e) := (R, q e , qi, Sym(e), CLvl(e), Rt(e)) 
where q e is defined as in the previous case. 

(3) If pi is of the Form F2J then we proceed as follows. 

• If d is a leaf of Enc(ci), set pr<= (d) := (P, Sym(<i), CLvl(<i), Rt(d)); 

• otherwise, let j G {0, 1} be maximal such that dj G Enc(ci). Then we set pn^(d) := 

(P, 7T 2 (PR-* iej)) , q u Sym(d), CLvl(d), Rt(d)) . 
In case that there is some e G {0, 1}* such that d = el, then define pn^(e0f) := for 
all / G {0, 1}* such that eOf G Enc(ci). 

These rules define pr<= on Enc(ci) \ Enc(c2). Let d be the rightmost leaf of Enc(c2), and 

set pR^(d) := (^S, q n , q n , Sym(d), CLvl(d),Rt(d)^ . Let < d be the maximal element in 

the rightmost branch of Enc(c2) such that pn<= (d) is undefined. Let j G {0, 1} be maximal 

such that dj G Enc(ci). We set pR^(d) := (^S, 7t 2 (/9r^ (dj)), q n , Sym((f), CLvl(d), Rt(J)^ . 

We complete the definition by pn^(e) = qi and PR^(d) := q= for all d G Enc(c2) that are 
not in the rightmost branch of Enc(c2). A tedious, but straightforward induction shows 
that pr<= is an accepting run of Ar^ on Enc(ci) (g) Enc(c2). □ 



Appendix C. Automaton for Relation 

In the following definition, $w$ ranges over words, $\tau$ over letters from $\Sigma \setminus \{\bot\}$, $k$ over $\{1, 2\}$,
$q_i, q_e, q_e'$ over $Q$ and $z, z'$ over $\{q_=, \bot\}$. Whenever we have fixed a word $w$, then $\sigma := \mathrm{Sym}(w)$,
$l := \mathrm{CLvl}(w)$ and $x$ ranges over $\{(\sigma, l), \varepsilon\}$.

Definition C.l. A := (Q.4, E.4, _L, {g/}, A_4) where 

• (Qu(E x {l,2})U{e,D}) 2 , 

• Qa ■= {qi,-L,q=, (□,£)} U (Q x Q x 2^ x< ^ x E x {1,2} x {£, Pi, P 2 }), and 

• A^4 contains the following transitions: 

(a) (gj, (gi, g 2 ), (gi, 92, 0, -L, 1, S 1 ), L); 

(b) (g=, (y, y), z, z') for all y G (E x {1, 2}) U {e}; 

(c) ((□,£), (□,£), ±,±); 

now fix an arbitrary q = (gj, g e , Rt(pop 1 (u;)), Sym(w), CLvl(w), S 1 ). A_4 contains 

(a) (g, (x,x),q ,±) for each g = (qi, Qe, Rt(w), r, k, S); 

(b) (g, (x,x),z,g); 

(c) (g, (x,x),_L,_L) if qi = g e ; 

(d) (q,(x,x),qo,±) and (q, (x, x), gj, (□, e)) for g^ = (qi,q' e ,Rt(w),T,k, Pj) such that 
there is some q € Q with (q' e ,q) G hLp(wrfc) and (g, r, g e , ColPop fc ) G A; 

now fix an arbitrary g = (gj, g e , Rt(pop 1 (w)), Sym(w), CLvl(io), P 2 ). A_4 contains 

(a) (g, (x,D),±,±) if qi = g e ; 

(b) (g, (x, □), go, -L) for go = (gi, g e , Rt(w), r, A;, P 2 ) such that there is a g G Q with 
(g^g) G hLp(wr fc ) and (g, r, g e , ColPop*,) G A; 

now fix an arbitrary g = (gj, g e , Rt(pop 1 (w)), Sym(w), CLvl(u;), Pi). A .4 contains 

(a) (g, (x, x), go, _L) for go = (gj, g e , Rt(iu), r, /c, Pi) such that there is a g G Q with 
(g e ,g) G hLp(wr fc ) and (g, r, g e , ColPop^) G A; 

(b) (g, (x,x),z,g); 

(c) (g, (x,x),z,qi) for g x = (g i? g e ,Rt(popi (w)),a,l,P 2 ). 

Let us explain the use of the flags $S$ ('Searching the rightmost leaf of the second input'),
$P_1$ and $P_2$ ('Pop sequence'). Let $c = (q, s)$ and $c' = (p, t)$ be configurations such that
$t = \mathrm{pop}_1^k(s)$. Then we can always define nodes $d_1, d_2, d_3$ (and an auxiliary node $b_3$) in the
convolution $\mathrm{Enc}(c) \otimes \mathrm{Enc}(c')$ as follows. Let $d_3$ be the rightmost leaf of $\mathrm{Enc}(c)$, let $b_3$ be
the rightmost leaf of $\mathrm{Enc}(c')$, let $d_2$ be the minimal node of the rightmost path of $\mathrm{Enc}(c)$
which is not in $\mathrm{Enc}(c') \setminus \{b_3\}$ and let $d_1$ be the maximal node of the rightmost path of $\mathrm{Enc}(c)$
which is on the rightmost path of $\mathrm{Enc}(c')$. See Figure 4 for an example. By definition one






Figure 4: Nodes $d_1, d_2, d_3$ in case of $s = abcc : abcdd : abcdef$ and $t = abcc : abcdd : ab$ are
marked by boldface numbers 1, 2, 3, respectively. [Tree diagram not reproduced.]

concludes that $d_1 \leq d_2 \leq d_3$. An accepting run of $\mathcal{A}$ labels all nodes up to $d_1$ with flag
$S$, the nodes strictly between $d_1$ and $d_2$ with $P_1$ and the nodes between $d_2$ and $d_3$ by $P_2$.
Using these flags the automaton guarantees that $t = \mathrm{pop}_1^k(s)$ for some $k$. Furthermore, the
transitions used at the nodes labelled by $P_1$ or $P_2$ guarantee that there is a sequence of
loops and pop operations connecting the two configurations.

Lemma C.2. Let c\ and 02 be configurations. A R a accepts Enc(ci) ® Enc(c2) if and only 
if c 2 = pop! m (ci) such that (01,02) € R^- 

Proof (sketch). Assume that p is an accepting run of A R ij. on Enc(ci) ® Enc(c2). 

Every accepting run labels the root by qj. Now an easy induction shows that there are 
nodes < d\ < c?2 < ^3 such that the following holds. 

• c?3 is the rightmost leaf of Enc(ci). 

• All nodes < e < d% are labelled by elements in 

M :=QxQx 2 QxQ x S x {1,2} x {S,P 1 ,P 2 } 



such that ~K%(e) 



S if0<e<di, 
Pi if d\ < e < d 2 , 
P 2 if d 2 <e<d 3 . 

Furthermore, all nodes in Enc(ci) ® Enc(c2) to the left of this branch are labelled by q = 
which ensures that the two configurations agree on these nodes. We distinguish the following 
cases: 

di = c?2 = c?3 In this case, no transition of the form (j?d|) is used in the run. One easily 
concludes that a = 02 and (01,02) € R^ is witnessed by the run of length 
connecting c\ with 02- 

d\ = c?2 < c?3 In this case, we prove by induction from d\ to d 3 that the automaton uses at 
d\ a transition of the first form of (l?dj) . between d\ and d% it uses transitions 
from (l?bj) , and at dj, it uses a transition from (l?a|l . Due to Lemma 15.81 (first 
case) this implies that 02 = pop 1 m (ci) for some m € N and LStck(di, c\) is the 
stack of C2. Now, by induction from d% to d\ one proves for each d\ < e < d^ 
that iTi(e) is the state of c\ and ^(e) is a state such that there is a run 
witnessing (c\, (^(e), LStck(e, c\))) € R^. We conclude by another induction 
showing that "^(e) is the state of 02- 
di < di < ds Analogously to the previous case, we use Lemma 15.81 (second case) to show 
that C2 = pop 1 m (ci) for some m G N. Induction from d^ to d\ shows that for 
each d\ < e < d% there is some number k(e) < m such that 

top 2 (LStck(e,ci)) = top 2 (po Pl fe(e) (ci)) 

and the label of e is such that there is a run from c\ to (7T2(e),pop 1 fc ^ e ^(ci)) 
witnessing that this pair is in W~. Moreover, k(d\) = m and 7r2(di) is the 
state of C2 whence (01,02) G R^. 
For the other direction let c\ = (qi,s\),C2 = (92,^2) be configurations and p a run that 
witnesses (01,02) G R^. We only consider the case that Enc(ci) <S> Enc(c 2 ) is as described 
in ([2]) of Lemma 15.81 The other case is similar. Let b G Enc(c2) be such that 61 is the 
rightmost leaf of Enc(c2). Let c be maximal in Enc(c2) such that cl G Enc(ci) \ Enc(c2). 
Let d be the rightmost leaf of Enc(ci). Due to Lemma EBJ b < c < d. 

For all x < d, let w x := top 2 (LStck(x, Enc(ci))). For b < x < d let i x G dom(p) be 
minimal such that p(i x ) = (9, pop 2 (si) : w x ). Let q x G Q be the state at p(i x ). We define 
an accepting run A of Ard- on Enc(ci) (8) Enc(c2) as follows. For all e < x < d let 

A(x) := (gi,/ a; ,Rt(pop 1 («; a; )),Sym(«; a .),CLvl(ii; a! ),l^.) where 



f 92 


if x < 6, 




if 6 < x < d, 




ii x <b, 


< -Pl 


if 6 < x < c, 


,^ 


if c < x < d 



and 



Furthermore, we set A(61) := (□,£), A(e) := qi and X(x) := (7= for all other nodes 
x G Enc(ci) ® Enc(c2). 

Since /? decomposes as a sequence of loops, pop x operations and collapse operations of 
level 1 , it is straightforward to show that A is an accepting run of A R a. . □ 



Appendix D. Automaton for Relation Rr" 

In the following definition, we use the same convention regarding ranges of variables as in
Appendix B.

Before we define the automaton recognising the relation formally, we explain how 
a successful run of it will process a tree Enc(gi, s\) <g> Enc((?2 ) S2). The states of Ar^ come 
from the set {qi, q=, _L} U M where 

M:= Q x Q x S x {1,2} x (2 QxQ ) x (2 QxQ ) x {R,L} x {S, N}. 

$q_I$ is the final state that is exclusively used to label the root. $q_=$ is the state for all nodes
in $\mathrm{Enc}(q_1, s_1)$ that do not belong to the rightmost branch of this tree. This state is used to
check that $\mathrm{Enc}(q_1, s_1)$ and $\mathrm{Enc}(q_2, s_2)$ agree on this part of the convolution. The state $\bot$ is
the initial state only used for marking the end of the tree, i.e., $\bot$ is the label for the nodes
in $(\mathrm{Enc}(q_1, s_1) \otimes \mathrm{Enc}(q_2, s_2))^+$. The rest of the nodes are labelled by elements from $M$.
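
How such a bottom-up run is checked can be sketched in a few lines. The code below is an illustration under assumptions: transitions are tuples `(q, label, q0, q1)` meaning that a node with the given label may receive state `q` when its 0-successor and 1-successor carry `q0` and `q1`, and nodes outside the tree implicitly carry the initial state $\bot$. The toy automaton consists only of equality-checking transitions in the style of those for $q_=$, and, unlike in the paper, $q_=$ is used as the final state to keep the example short. The names `run_states`, `accepts` and `equality_automaton` are hypothetical.

```python
BOT = "⊥"  # initial state, carried by all nodes just outside the tree

def run_states(tree, delta, node=""):
    """Set of states that some bottom-up run can assign to `node`."""
    if node not in tree:
        return {BOT}
    left = run_states(tree, delta, node + "0")
    right = run_states(tree, delta, node + "1")
    return {q for (q, label, q0, q1) in delta
            if label == tree[node] and q0 in left and q1 in right}

def accepts(tree, delta, final):
    """A tree is accepted if some run labels the root with a final state."""
    return bool(run_states(tree, delta) & final)

def equality_automaton(alphabet):
    """Transitions (q=, (y, y), z, z') for z, z' in {q=, ⊥}."""
    states = ("q=", BOT)
    return {("q=", (y, y), z0, z1)
            for y in alphabet for z0 in states for z1 in states}

delta = equality_automaton({"a", "b", "c"})
same = {"": ("a", "a"), "0": ("b", "b"), "1": ("a", "a")}
diff = {"": ("a", "a"), "0": ("b", "c")}
print(accepts(same, delta, {"q="}), accepts(diff, delta, {"q="}))
```

The second tree is rejected because no transition reads the label `("b", "c")`, so no state can be assigned to that node and hence none to the root.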

For some $q \in M$ we write $\pi_i(q)$ for the projection to the $i$-th component. In an accepting
run, a node $d$ is labelled by $q \in M$ if the following is satisfied.

C1. $\pi_8(q) = S$ iff $d$ is in the rightmost branch of $\mathrm{Enc}(q_1, s_1)$. $S$ stands for "searching the
rightmost leaf of $(q_1, s_1)$" while $N$ stands for "normal reachability". We ensure that
$\pi_8(q) = N$ iff $d$ is in $\mathrm{Enc}(q_2, s_2) \setminus \mathrm{Enc}(q_1, s_1)$.

C2. $\pi_7(q) = R$ iff $d$ is in the rightmost branch of $\mathrm{Enc}(q_2, s_2)$ (by definition this is also the
rightmost branch of $\mathrm{Enc}(q_1, s_1) \otimes \mathrm{Enc}(q_2, s_2)$). $R$ stands for "rightmost branch" while
$L$ stands for "left".

C3. $\pi_6(q) = \mathrm{Lp}(\mathrm{LStck}(d, \mathrm{Enc}(q_2, s_2)))$. Since $s_1 \in \mathrm{MS}(s_2)$, $d \in \mathrm{Enc}(q_1, s_1)$ implies that
$\pi_6(q) = \mathrm{Lp}(\mathrm{LStck}(d, \mathrm{Enc}(q_1, s_1)))$.

C4. $\pi_5(q) = \mathrm{Rt}(\mathrm{LStck}(d, \mathrm{Enc}(q_2, s_2)))$. Since $s_1 \in \mathrm{MS}(s_2)$, $d \in \mathrm{Enc}(q_1, s_1)$ implies that
$\pi_5(q) = \mathrm{Rt}(\mathrm{LStck}(d, \mathrm{Enc}(q_1, s_1)))$.

C5. $\pi_4(q) = \mathrm{CLvl}(\mathrm{LStck}(d, \mathrm{Enc}(q_2, s_2)))$.

C6. $\pi_3(q) = \mathrm{Sym}(\mathrm{LStck}(d, \mathrm{Enc}(q_2, s_2)))$.

C7. Let $q_i := \pi_1(q)$ and $q_e := \pi_2(q)$. For $d \in \mathrm{Enc}(q_2, s_2) \setminus \mathrm{Enc}(q_1, s_1)$ there is a run from
$(q_i, \mathrm{LStck}(d, \mathrm{Enc}(q_2, s_2)))$ to $(q_e, \mathrm{IgM}(d, \mathrm{Enc}(q_2, s_2)))$. If $d$ is in the rightmost path of
$\mathrm{Enc}(q_1, s_1)$, then $q_i = q_1$ and there is a run from $(q_1, s_1)$ to $(q_e, \mathrm{IgM}(d, \mathrm{Enc}(q_2, s_2)))$.

Definition D.l. Let S = (Q,Y,,T, A$,qo) be some CPS. Define the automaton Ar^ as 
follows. The set of states is contained in {qj, q = , ±} U M where 

M := Q x Q x £ x {1, 2} x (2 QxQ ) x (2 QxQ ) x {i?, L} x {S, N}. 

$\bot$ is the initial state and $q_I$ is the only final state. The transition relation $\Delta_{\mathcal{A}}$ of $\mathcal{A}$ contains
the following transitions.

T1. $(q_I, (q_1, q_2), (q_1, q_2, \bot, 1, \mathrm{Rt}(\bot), \mathrm{Lp}(\bot), R, S), \bot) \in \Delta_{\mathcal{A}}$ for all pairs $(q_1, q_2) \in Q^2$, and
T2. $(q_=, (y, y), X, Y)$ for $X, Y \in \{q_=, \bot\}$ and $y \in (\Sigma \times \{1, 2\}) \cup \{\varepsilon\}$.
Fix some g := (q\,q 2 ,a,l, Rt(w),Lp(w), R, S). Then we add the following transitions. 
T3. (q,(x,x),±,±) if gi =q 2 ; 
T4. (g,(x,x),±,gi) for 

(<7i,o-,7,<Lclone 2 ) G A 5 , 
(q,q[) G Lp(w), and 
?i = (9i,92,o-,/,Rt(u;),Lp(«;),i2,JV); 
T5. (g,(x,z),X,g) for* G {<?=,!.}; 

T6. (<?, (x,x),g ,-L) for qo ■= {qi, q 2 ,T, k,Rt(wT k ),Lp(wT k ), R, S); 
T7. (q, (x,x),q ,qi) for 

<?o := {qi,q' 2 ,r,k,Rt{wT k ),Lp(wT k ),L,S), 
(g 2 ,r,7,g,ColPop fc ) G A 5 , 

G LpH, and 
?i : = (9 , i 1 ?2,ff,l,Rt(t«),Lp(w),i?,JV); 
Fix some g := (gi, 52, c, /, Rt(u;), Lp(u>), L, S). Then we add the following transitions. 
T8. (q,(x,x),±,±) for (q ± , a, 7, g, clone 2 ) £A S and (g,g 2 ) S Lp(ro); 
T9. (q, (x,x),qo,±.) for 

9o ■= (<?i, <?2 ; ^ Rt(wT fc ), Lp(wT- fc ), i, S 1 ), 

(92)T)9,ColPop fc ) G A 5 and 

(q,q 2 ) G hp(w); 

T10. (g,(x,x),X,g)forX€{g=,±}; 
Til. for 

(9i,o",7)9,clone 2 ) € A 5 , 

(?) 9i) € Lp(iy) and 

9i : = (9i,92,cr,Z,Rt(«;),Lp(«;),L,iV); 

T12. (q, (x,x),q ,qi) for 

g := {qi,q' 2 ,T,k,Rt(wTk),Lp(wT k ),L,S), 

(g 2 ,T,g,ColPop fc ) G A 5 , 

(?) Qi) €= Lp(«;) and 

?i : = ill, 12, v, l,Rt(w),Lp(w), L, N). 

Fix some g := (gi, g 2 , <?, I, Rt(w),Lp(w), R, N). Then we add the following transitions. 

T13. (q,(D,x),±,±) Hqi = q 2 ; 
T14. (q,(D,x),q ,±) for 

(9i,0",7,9,push rfe ) G A 5 , 

(g,gi) G Lp(ifTfc) and 

9o ■= (q'i,q 2 ,T,k,Rt(wT k ),Lp(wT k ),R,N); 

T15. (g, (□, x),±, qi) for 

(9i,0",7,9,clone 2 ) G A s , 

(9, 9i) € Lp(u>) and 

ffi : = (9i,92,^,^Rt(w),Lp(u;),i?,iV); 

T16. (q, (D,x),q Q ,qi) for 

(9i,o-,7,9,Pushr,fc) G As, 
(g,gi) G Lp(u>r fc ), 

90 := (o , i,Q , 2 5 ' r : fc, Rt(xt;T fc ), Lp(wT fc ), -L, AT), 
(g 2 ,T,g',ColPo Pfc ) G A 5 , 

(9') 9i) £ Lp(iy) and 

91 := (9i , 92, cr, /, Rt(w), Lp(w), J?, AT). 

Fix some 9 := (91, 92, c, /, Rt(u;), Lp(iy), L, iV). Then we add the following transitions. 
T17. (9, (D,x),±,±) for (gi,cr,7,g,clone 2 ) G A 5 and (g,g 2 ) G Lp(u>); 
T18. (?,(□, x),gb,_L) for 

(9i,o",7,9,PUsli Tifc ) G A 5 , 

(g,gi) G Lp(y;T fc ), 

:= (gi,g 2 ,r, A;,Rt(wT fc ),Lp(u;7- fc ),L, AT), 

(g2' T J ( /'CorPop fe ) G A5 and 

(g',g 2 ) G Lp(w); 
T19. (g, (□, x),±, qi) for 

(gi,cr,7,g,clone 2 ) G A 5 , 

(g,gi) G Lp(ui), and 

gi := (g'i,g2,o-,^Rt(w),Lp(w),L,iV); 

T20. (g, (□,*), g , gi) for 

(gi,cr,7 ; 9,Push Tjfc ) G A 5 , 
(g,gi) G Lp(wT k ), 

go := (<li,<l2, T ,k,Rt(wTk),Lp(wTk)),L,N), 
(g 2 ,T,g',ColPop fc ) G A 5 , 
(g'> Qi) ^ Lp(w) and 

gi := (gi,g2,o-,/,Rt(w),Lp(w),L,iV). 

The next lemma is a first step towards the proof that any accepting run of Ar^ on a tree 
Enc(ci) ® Enc(c2) witnesses the existence of some run from c\ to c 2 . 

Lemma D.2. Let $S$ be some CPS. Let $(q_1, s_1), (q_2, s_2)$ be configurations and let $\rho$ be an
accepting run on $T := \mathrm{Enc}(q_1, s_1) \otimes \mathrm{Enc}(q_2, s_2)$. Then Conditions C1-C6 of the beginning
of this section hold and $s_1 \in \mathrm{MS}(s_2)$.

The proof consists of straightforward inductions. 

Lemma D.3. Let $S$ be some CPS. Let $(q_1, s_1), (q_2, s_2)$ be configurations and let $\rho$ be an
accepting run on $T := \mathrm{Enc}(q_1, s_1) \otimes \mathrm{Enc}(q_2, s_2)$. Let $T_1 := \mathrm{dom}(T) \setminus \mathrm{dom}(\mathrm{Enc}(q_1, s_1))$ and
$T_2$ be the rightmost branch of $\mathrm{Enc}(q_1, s_1)$ without the root. Furthermore, for all $d \in T_1$, let
$s_d := \mathrm{LStck}(d, \mathrm{Enc}(q_2, s_2))$ and for each $d \in T_2$, let $s_d := s_1$. For each $d \in T_1 \cup T_2$ we have
$\rho(d) \in M$ and there is a run $\rho_S$ of $S$ from $(\pi_1(\rho(d)), s_d)$ to $(\pi_2(\rho(d)), \mathrm{IgM}(d, \mathrm{Enc}(q_2, s_2)))$
such that for all $0 < i < \mathrm{length}(\rho_S)$, $\rho_S(i) \neq s_1$.

Proof. The proof is by induction starting at the leaves. The base cases are the following.

• Assume that $s_1 = s_2$. Due to C1 and C2 the rightmost leaf $d$ of $T$ satisfies $(\pi_7(d), \pi_8(d)) =
(R, S)$. Thus, $\rho$ applies at $d$ some transition of the form T3. Due to the existence of this
transition, we conclude that $\pi_1(\rho(d)) = \pi_2(\rho(d))$. Since $s_d = \mathrm{IgM}(d, \mathrm{Enc}(q_2, s_2)) = s_1$, we
conclude there is a loop of length 0 from $(\pi_1(\rho(d)), s_d)$ to $(\pi_2(\rho(d)), \mathrm{IgM}(d, \mathrm{Enc}(q_2, s_2)))$.

• Assume that $s_1 \neq s_2$. Let $d$ be the rightmost leaf of $T$, i.e., $d$ is the rightmost leaf
of $\mathrm{Enc}(q_2, s_2)$. Due to C1 and C2, $(\pi_7(d), \pi_8(d)) = (R, N)$. Thus, $\rho$ applies a
transition of the form T13 whence $\pi_1(\rho(d)) = \pi_2(\rho(d))$. As in the previous case we
obtain $s_d = \mathrm{IgM}(d, \mathrm{Enc}(q_2, s_2)) = s_2$ and a run of length 0 connects $(\pi_1(\rho(d)), s_d)$ with
$(\pi_2(\rho(d)), \mathrm{IgM}(d, \mathrm{Enc}(q_2, s_2)))$ because the two configurations agree.






Now let $d \in T_1$ be a leaf of $T$ that is not in the rightmost branch. Due to C1 and
C2, $(\pi_7(d), \pi_8(d)) = (L, N)$. Thus, $\rho$ applies a transition of the form T17. Note that
$s_d = \mathrm{LStck}(d, \mathrm{Enc}(q_2, s_2))$ and $\mathrm{IgM}(d, \mathrm{Enc}(q_2, s_2)) = \mathrm{clone}_2(s_d)$. Due to the conditions on
the existence of a transition of form T17, one immediately concludes that there is a run
from $(\pi_1(\rho(d)), s_d)$ to $(\pi_2(\rho(d)), \mathrm{IgM}(d, \mathrm{Enc}(q_2, s_2)))$.

Now let $d$ be the rightmost leaf in $T$ that is in $\mathrm{Enc}(q_1, s_1)$. Since $d$ is not in the rightmost
branch of $T$ and due to C1 and C2, $(\pi_7(d), \pi_8(d)) = (L, S)$. Thus, $\rho$ applies a transition of
the form T8. Note that $s_d = \mathrm{LStck}(d, \mathrm{Enc}(q_2, s_2))$ and $\mathrm{IgM}(d, \mathrm{Enc}(q_2, s_2)) = \mathrm{clone}_2(s_d)$.
Due to the conditions on the existence of a transition of form T8, one immediately
concludes that there is a run from $(\pi_1(\rho(d)), s_d)$ to $(\pi_2(\rho(d)), \mathrm{IgM}(d, \mathrm{Enc}(q_2, s_2)))$.
Note that all the runs obtained in the base cases do not visit the stack $s_1$ except for the
first configuration in the run associated to the rightmost leaf of $\mathrm{Enc}(q_1, s_1)$.

Analogously to the base case, the inductive step consists of a lengthy but rather straight- 
forward case distinction. Instead of stating all cases, we mention the crucial ideas underlying 
the proof. 

• For d € {0, 1}* and i £ {0, 1} such that di is in the rightmost branch of Enc^i, si), then 
Sd = Sdi = s\. Thus, the run associated to di serves as initial part of the run associated 
to d. 

• For d € Enc((/2, S2) \ Enc(gi,si) or d the rightmost leaf of Enc((/i,si), let i € {0,1} 
minimal such that di € Enc((/2> s 2 ). Then LStck(d, (g 2 , S2)) and LStck(di, (q2, S2)) differ 
in one stack operation op. The transition of Ar^ used at d ensures that there is a 
run from (ni(p(d)), LStck(<i, (</2jS2)) to (iri(p(di)), LStck(<ii, (52, S2))) that performs this 
operation op followed by a loop. The composition of this run with the run associated to 
di serves as initial part of the run associated to d. 

• If $d$ is in the rightmost branch of $\mathrm{Enc}(q_2, s_2)$ and $i \in \{0, 1\}$ is maximal such that $di \in \mathrm{Enc}(q_2, s_2)$,
then $\mathrm{IgM}(d, \mathrm{Enc}(q_2, s_2)) = \mathrm{IgM}(di, \mathrm{Enc}(q_2, s_2))$, whence the run associated to
$di$ serves as final part of the run associated to $d$.

• If $d1 \in \mathrm{Enc}(q_2, s_2)$, then $\mathrm{IgM}(d, \mathrm{Enc}(q_2, s_2)) = \mathrm{IgM}(d1, \mathrm{Enc}(q_2, s_2))$, whence the run
associated to $d1$ serves as final part of the run associated to $d$.

• If $d$ is not in the rightmost branch of $\mathrm{Enc}(q_2, s_2)$ and $d1 \notin \mathrm{Enc}(q_2, s_2)$, then we have
$\mathrm{IgM}(d, \mathrm{Enc}(q_2, s_2)) = \mathrm{pop}_1(\mathrm{IgM}(d0, \mathrm{Enc}(q_2, s_2)))$. Furthermore, the transition of $\mathcal{A}_{R^{\Rightarrow}}$
used at $d$ ensures that there exists a run from $(\pi_2(\rho(d0)), \mathrm{IgM}(d0, \mathrm{Enc}(q_2, s_2)))$ to
$(\pi_2(\rho(d)), \mathrm{IgM}(d, \mathrm{Enc}(q_2, s_2)))$. This run serves as final part of the run associated to $d$.

• If $d0, d1 \in \mathrm{Enc}(q_2, s_2)$, then $\mathrm{LStck}(d1, (q_2, s_2)) = \mathrm{pop}_1(\mathrm{IgM}(d0, \mathrm{Enc}(q_2, s_2)))$. Furthermore,
the existence of the transition of $\mathcal{A}_{R^{\Rightarrow}}$ used at $d$ ensures that there is a run from
$(\pi_2(\rho(d0)), \mathrm{IgM}(d0, \mathrm{Enc}(q_2, s_2)))$ to $(\pi_1(\rho(d1)), \mathrm{LStck}(d1, (q_2, s_2)))$. This run is used to
connect the initial part induced by $d0$ with the final part induced by $d1$ in order to obtain
the run associated to $d$. □
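The inductive step repeatedly glues runs together: an initial part, a connecting run and a final part. A minimal sketch of this gluing is given below, assuming that a run is a non-empty list of configurations and that two runs may be composed only if the last configuration of the first equals the first configuration of the second. The helper name `compose` and the tuple representation of configurations are hypothetical.

```python
from typing import List, Tuple

# Assumption: a configuration is a pair (state, stack) with a hashable,
# comparable stack representation, here a tuple of tuples of symbols.
Config = Tuple[str, tuple]


def compose(*runs: List[Config]) -> List[Config]:
    """Concatenate runs, merging the shared boundary configurations."""
    result = list(runs[0])
    for run in runs[1:]:
        if result[-1] != run[0]:
            raise ValueError("runs do not fit together")
        # Skip the first configuration, it is already the end of result.
        result.extend(run[1:])
    return result
```

For instance, composing an initial part that ends in some configuration `c` with a final part that starts in `c` yields a single run in which `c` occurs once at the junction.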

Remark D.4. Due to the transitions of the form TfTJ, any accepting run $\rho$ of $\mathcal{A}_{R^{\Rightarrow}}$ on a
tree $\mathrm{Enc}(q_1, s_1) \otimes \mathrm{Enc}(q_2, s_2)$ satisfies $(\pi_1(\rho(0)), \pi_2(\rho(0))) = (q_1, q_2)$. Moreover, recall that
$\mathrm{IgM}(0, \mathrm{Enc}(q_2, s_2)) = s_2$. Thus, the lemma implies that $((q_1, s_1), (q_2, s_2)) \in R^{\Rightarrow}$ if there is
an accepting run of $\mathcal{A}_{R^{\Rightarrow}}$ on $\mathrm{Enc}(q_1, s_1) \otimes \mathrm{Enc}(q_2, s_2)$.

Lemma D.5. Let $\mathcal{S}$ be some CPS. Let $c_1 := (q_1, s_1)$, $c_2 := (q_2, s_2)$ be configurations such
that $s_1 = \mathrm{pop}_2^k(s_2)$ for some $k \in \mathbb{N}$. Let $\rho_{\mathcal{S}}$ be a run from $(q_1, s_1)$ to $(q_2, s_2)$ witnessing
$(c_1, c_2) \in R^{\Rightarrow}$. Then there is an accepting run of $\mathcal{A}_{R^{\Rightarrow}}$ on $T := \mathrm{Enc}(q_1, s_1) \otimes \mathrm{Enc}(q_2, s_2)$.

Proof. We define the accepting run $\rho$ as follows. Set $\rho(\varepsilon) := q_I$, $\rho(d) := \bot$ for all $d \in T_+$,
$\rho(d) := q_=$ for all $d \in \mathrm{Enc}(q_1, s_1) \setminus B$ where $B$ is the rightmost branch of $\mathrm{Enc}(q_1, s_1)$, and
for all other $d \in \mathrm{dom}(T)$, set

$\rho(d) := (q_1^d, q_2^d, \mathrm{Sym}(s'), \mathrm{CLvl}(s'), \mathrm{Rt}(s'), \mathrm{Lp}(s'), X, Y)$

where $s' := \mathrm{LStck}(d, \mathrm{Enc}(q_2, s_2))$ if $d \notin B$ and $s' := s_1$ if $d \in B$, and $q_1^d$, $q_2^d$, $X$ and $Y$ are
defined as follows.

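• $q_1^d := q_1$ if $d \in \mathrm{Enc}(q_1, s_1)$. Otherwise let $j \in \mathrm{dom}(\rho_{\mathcal{S}})$ be maximal such that $\rho_{\mathcal{S}}(j) =
(q, \mathrm{LStck}(d, \mathrm{Enc}(q_2, s_2)))$ for some $q \in Q$. Set $q_1^d := q$.

• Let $j \in \mathrm{dom}(\rho_{\mathcal{S}})$ be maximal such that $\rho_{\mathcal{S}}(j) = (q, \mathrm{IgM}(d, \mathrm{Enc}(q_2, s_2)))$ for some $q \in Q$.
Set $q_2^d := q$.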

• Set $X = R$ if $d$ is in the rightmost branch of $T$ and set $X = L$ otherwise.

• Set $Y = S$ if $d$ is in the rightmost branch of $\mathrm{Enc}(q_1, s_1)$ and set $Y = N$ otherwise.
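The choice of the two flags $X$ and $Y$ can be sketched as follows. The sketch assumes that a tree domain is a prefix-closed set of strings over `'0'` and `'1'` containing the empty string, and that the rightmost branch is obtained by starting at the root and always moving to the largest existing child. Both assumptions, and the function names `rightmost_branch` and `flags`, are illustrative and not taken from the paper.

```python
from typing import Set, Tuple


def rightmost_branch(dom: Set[str]) -> Set[str]:
    """Nodes on the path from the root to the rightmost leaf of dom."""
    branch, d = {""}, ""
    while True:
        for child in ("1", "0"):  # prefer the right child
            if d + child in dom:
                d += child
                branch.add(d)
                break
        else:
            # Neither child exists, so d is the rightmost leaf.
            return branch


def flags(d: str, dom_t: Set[str], dom_enc1: Set[str]) -> Tuple[str, str]:
    """Return (X, Y) for node d as in the last two items of the list."""
    x = "R" if d in rightmost_branch(dom_t) else "L"
    y = "S" if d in rightmost_branch(dom_enc1) else "N"
    return x, y
```

With `dom_enc1 = {"", "0", "00"}` and `dom_t = {"", "0", "00", "01", "1"}`, the node `"00"` is the rightmost leaf of the first domain but not on the rightmost branch of `dom_t`, so it receives the pair `("L", "S")`, the combination that appears in the second base case.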

A straightforward but tedious induction shows that $\rho$ is accepting on $T$. It relies on the
decomposition result for runs witnessing $(c_1, c_2) \in R^{\Rightarrow}$ from Corollary 4.10. □



Appendix E. Modifications for the Proof of Proposition 3.9 (cf. page 36)

We replace the automaton $\mathcal{A}_{R^{\Leftarrow}}$ in the construction of Reach on the product of the pushdown
system with the automaton for the regular language $L$ with the following version $\mathcal{A}'_{R^{\Leftarrow}}$. Let
$Q_L$ be the states of a finite automaton $\mathcal{A}_L$ recognising $L$, $i_0 \in Q_L$ be its initial state and $F \subseteq Q_L$
be its final states. We replace transitions of the form $(q_I, (q_1, q_2), (S, q_1, q_2, \bot, 1, \mathrm{Rt}(\bot_2)), \bot)$
by transitions of the form $(q_I, (q_1, q_2), (S, (q_1, i_0), (q_2, q), \bot, 1, \mathrm{Rt}(\bot_2)), \bot)$ for $q_1, q_2 \in Q$ and
$q \in Q_L$. Constructing $\varphi$ with $\mathcal{A}'_{R^{\Leftarrow}}$ instead of $\mathcal{A}_{R^{\Leftarrow}}$ ensures that $T_1$ encodes some
configuration $(q, s)$ where the state $q$ is in $Q$, but it checks for runs starting in $((q, i_0), s)$.

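Furthermore, we replace the automaton $\mathcal{A}_{R^{\Rightarrow}}$ with the version $\mathcal{A}'_{R^{\Rightarrow}}$ where we replace
the transitions of the form $(q_I, (q_1, q_2), (q_1, q_2, \bot, 1, \mathrm{Rt}(\bot), \mathrm{Lp}(\bot), R, S), \bot)$ of $\mathcal{A}_{R^{\Rightarrow}}$ with the
transitions $(q_I, (q_1, q_2), ((q_1, q), (q_2, q_f), \bot, 1, \mathrm{Rt}(\bot), \mathrm{Lp}(\bot), R, S), \bot)$ for $q_1, q_2 \in Q$,
$q \in Q_L$ and $q_f \in F$. Constructing $\psi$ with $\mathcal{A}'_{R^{\Rightarrow}}$ instead of $\mathcal{A}_{R^{\Rightarrow}}$ ensures that $T_2$ encodes
some configuration $(q, s)$ where the state $q$ is in $Q$, but it checks for runs ending in $((q, q_f), s)$
for some final state $q_f$ of $\mathcal{A}_L$.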
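The product of the pushdown system with the automaton for $L$ can be illustrated by a short sketch. It assumes that every transition of the pushdown system carries an edge label, that the automaton for $L$ is given as a deterministic partial transition function, and that stack operations are opaque values copied unchanged. The tuple layout and the names `product` and `dfa_delta` are hypothetical and serve only as illustration.

```python
from typing import Dict, Hashable, List, Tuple

# Assumption: a pushdown transition is (state, top symbol, edge label,
# successor state, stack operation); the operation is treated as opaque.
Transition = Tuple[Hashable, str, str, Hashable, str]


def product(transitions: List[Transition],
            dfa_delta: Dict[Tuple[Hashable, str], Hashable]) -> List[Transition]:
    """Pair each system state with a state of the automaton for L."""
    states_l = {p for (p, _) in dfa_delta} | set(dfa_delta.values())
    result = []
    for (q, gamma, label, q_next, op) in transitions:
        for p in states_l:
            if (p, label) in dfa_delta:
                # The automaton component follows the edge label.
                p_next = dfa_delta[(p, label)]
                result.append(((q, p), gamma, label, (q_next, p_next), op))
    return result
```

A run of the product from a state $(q, i_0)$ to a state $(q', q_f)$ with $q_f \in F$ then corresponds to a run of the original system from $q$ to $q'$ whose label sequence is accepted by the automaton, which is the property the modified transitions above check at the root.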